Archive Class
The Archive class is the primary mechanism for opening PSRFITS files.
- class Archive(filename[, prepare=True, lowmem=False, verbose=True, weight=True, center_pulse=True, baseline_removal=True, wcfreq=True, thread=False, onlyheader=False])¶
- Parameters
prepare (bool) – Argument passed to load(). If True, the file is automatically polarization averaged with pscrunch(), dedispersed with dedisperse() (using the weighted center frequency if wcfreq is set to True), and centered with center() if center_pulse is set to True.
lowmem (bool) – Argument passed to load(). If True, the PSRFITS file is opened in memmap mode and the data arrays are also replaced with memmaps.
verbose (bool) – Print extra information on loading and processing.
weight (bool) – Argument passed to load(). Use the stored data weights, which is the typical mode.
center_pulse (bool) – Argument passed to load(). If True, the peak of the pulse is centered in the middle of the data arrays. This is preferred for plotting purposes; the resulting arrival-time shifts are computed and stored internally.
baseline_removal (bool) – Argument passed to load(). Subtracts the baseline intensity of the average profile off-pulse region from all individual data profiles using remove_baseline().
wcfreq (bool) – Argument passed to load(). If True, the weighted center frequency is used in dedisperse() if prepare=True.
thread (bool) – Argument passed to load(). If True, the calculation of the data array is parallelized, which can speed up loading of large data files but takes longer for small data files because of the overhead of starting the process.
onlyheader (bool) – Argument passed to load(). If True, only the primary and table headers are processed, without the data array. This is much faster if you only need access to metadata.
Usage:
ar = Archive(FILENAME) #loads archive, dedispersed and polarization averaged by default
ar.tscrunch() #averages the pulse in time
data = ar.getData() #returns the numpy data array for use by you
ar.imshow() #plots frequency vs phase for the pulses
Description of Data
From Appendix A.1 of the thesis Lam 2016:
The primary data array of profiles in a PSRFITS file is given by \(\mathcal{I}(t,\mathrm{pol},\nu,\phi)\), the pulse intensity as a function of time \(t\), polarization \(\mathrm{pol}\), frequency \(\nu\), and phase \(\phi\), where the arguments are in the order of the array dimensions. To save memory, intensity data are stored in multiple arrays. The raw data array (DATA) \(d\) is the largest in dimensionality but for folded pulse data is typically stored as an array of 16-bit integers. To retrieve the raw data value for each pulse profile, the data array is then multiplied by a scale array (DAT_SCL) \(s\) and an offset array (DAT_OFFS) \(o\) is added. An array of weights (DAT_WTS) \(w\) is also stored internally and typically modifies the raw data, e.g., when excising radio frequency interference. The three modifier arrays are of much smaller size than the data array and are typically stored in 32-bit single-precision float format. Mathematically, the resultant array of pulse intensities can be written as
\[\mathcal{I}(t,\mathrm{pol},\nu,\phi) = w(t,\nu) \left[ s(t,\mathrm{pol},\nu)\, d(t,\mathrm{pol},\nu,\phi) + o(t,\mathrm{pol},\nu) \right].\]
PSRFITS files also contain a wide range of additional information stored internally, including a history of all PSRCHIVE modifications to the file, a folding ephemeris, and a large global header of useful metadata. Besides the data array, PyPulse will unpack and store all extra information for retrieval via get() methods as desired.
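The scale, offset, and weight reconstruction described above can be sketched with NumPy broadcasting. This is a minimal illustration, not PyPulse's loading code: the array sizes are made up, and the assumed shapes (scales and offsets per subintegration, polarization, and channel; weights per subintegration and channel) are an assumption about the PSRFITS layout.

```python
import numpy as np

# Hypothetical dimensions: subintegrations, polarizations, channels, phase bins
nsub, npol, nchan, nbin = 2, 1, 3, 8

rng = np.random.default_rng(0)
d = rng.integers(-1000, 1000, size=(nsub, npol, nchan, nbin)).astype(np.int16)  # DATA
s = np.full((nsub, npol, nchan), 0.5, dtype=np.float32)    # DAT_SCL (assumed shape)
o = np.full((nsub, npol, nchan), 10.0, dtype=np.float32)   # DAT_OFFS (assumed shape)
w = np.ones((nsub, nchan), dtype=np.float32)               # DAT_WTS (assumed shape)
w[:, 1] = 0.0  # pretend channel 1 was excised as RFI

# Broadcast the small arrays over the missing axes: I = w * (s * d + o)
intensity = w[:, None, :, None] * (s[..., None] * d + o[..., None])
```

Only the 16-bit array scales with the number of phase bins, which is why this storage scheme saves memory.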
Methods
- load(filename[, prepare=True, center_pulse=True, baseline_removal=True, weight=True, wcfreq=True, onlyheader=False])¶
Load a PSRFITS file, process the metadata, and form the data arrays. This is called internally by __init__().
- Parameters
filename (str) – Path to load file from.
prepare (bool) – This performs three tasks. It will polarization average the data via pscrunch(), dedisperse the data with dedisperse(), and rotate the pulse so that the peak is in the center of phase with center(). For centering, this will store the relevant time delays associated with the rotation.
center_pulse (bool) – The peak of the pulse is centered in the middle of the data arrays. This is preferred for plotting purposes; the resulting arrival-time shifts are computed and stored internally.
baseline_removal (bool) – Subtract the baseline intensity of the average profile off-pulse region from all individual data profiles.
weight (bool) – Use the stored data weights, which is the typical mode.
wcfreq (bool) – The weighted center frequency is used in dedisperse() if prepare=True.
onlyheader (bool) – Only the primary and table headers are processed, without the data array. This is much faster if you only need access to metadata.
- Returns
None
- save(filename)¶
Save the data to a new PSRFITS file.
- Parameters
filename (str) – Path to save file to.
Warning
save() will output a PSRFITS file, but the output data arrays vary slightly from the input data arrays. More
- gc()¶
Manually clear the data cube and weights for Python garbage collection
- shape([squeeze=True])¶
Return the shape of the data array.
- Parameters
squeeze (bool) – Return the shape of the data array after dimensions of length 1 are removed.
- Returns
shape, tuple of integers
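The effect of the squeeze flag can be illustrated with plain NumPy. The cube below is a made-up stand-in for the data array, not an Archive object.

```python
import numpy as np

# Hypothetical cube: 1 subintegration, 1 polarization, 4 channels, 16 phase bins
cube = np.zeros((1, 1, 4, 16))

full_shape = cube.shape                  # analogous to shape(squeeze=False)
squeezed_shape = np.squeeze(cube).shape  # analogous to shape(squeeze=True)
```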
- reset([prepare=True])¶
Replace the data with the original clone, preventing full reloading. Useful for larger files but only if the lowmem flag is set to True.
- scrunch([arg='Dp', **kwargs])¶
Average the data cube along different axes.
- Parameters
arg (str) – Can be T for tscrunch(), p for pscrunch(), F for fscrunch(), B for bscrunch(), and D for dedisperse(), following the PSRCHIVE conventions.
- Returns
self
- tscrunch([nsubint=None, factor=None])¶
Perform a weighted average of the data cube along the time dimension.
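A weighted average along the time axis can be sketched as follows. This is an illustration under assumptions, not the library's implementation: it collapses all subintegrations at once (ignoring nsubint and factor), assumes a polarization-averaged (time, frequency, phase) cube, and returns zeros where every weight is zero.

```python
import numpy as np

def weighted_time_average(data, weights):
    """Weighted average over axis 0 (time) of a (time, freq, phase) cube.

    weights has shape (time, freq); channels with zero total weight give zeros.
    """
    w = weights[:, :, None]
    wsum = w.sum(axis=0)
    num = (w * data).sum(axis=0)
    return np.divide(num, wsum, out=np.zeros_like(num), where=wsum > 0)

data = np.array([[[1.0, 2.0]], [[3.0, 6.0]]])  # 2 subints, 1 channel, 2 bins
weights = np.array([[1.0], [3.0]])
avg = weighted_time_average(data, weights)
```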
- pscrunch()¶
Perform an average of the data cube along the polarization dimension. Can handle data in Coherence (AABBCRCI) or Stokes (IQUV) format.
- Returns
self
Todo
Perform a weighted average of the data cube
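The two polarization formats collapse to total intensity differently. The sketch below assumes a (time, polarization, frequency, phase) cube and forms total intensity as AA + BB for Coherence data; whether the library sums or averages the two products is not stated here, so treat that choice as an assumption.

```python
import numpy as np

def total_intensity(data, pol_type):
    """Collapse the polarization axis (axis 1) of a (t, pol, f, phase) cube."""
    if pol_type == "AABBCRCI":    # Coherence: combine the two direct products
        total = data[:, 0] + data[:, 1]
    elif pol_type == "IQUV":      # Stokes: I is already the total intensity
        total = data[:, 0]
    else:
        raise ValueError(f"unsupported polarization type: {pol_type}")
    return total[:, None]         # keep a length-1 polarization axis

cube = np.arange(2 * 4 * 3 * 5, dtype=float).reshape(2, 4, 3, 5)
coherence = total_intensity(cube, "AABBCRCI")
stokes = total_intensity(cube, "IQUV")
```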
- fscrunch([nchan=None, factor=None])¶
Perform a weighted average of the data cube along the frequency dimension.
- bscrunch([nbins=None, factor=None])¶
Perform an average of the data cube along the phase (bin) dimension.
Todo
Perform a weighted average of the data cube
- dedisperse([DM=None, reverse=False, wcfreq=False])¶
Dedisperse the pulses by introducing the appropriate time delays and rotating in phase.
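The delay-and-rotate idea can be sketched with the standard cold-plasma dispersion delay, \(\Delta t = K\,\mathrm{DM}\,(\nu^{-2} - \nu_\mathrm{ref}^{-2})\). The function names, the integer-bin rotation, and the choice of reference frequency are illustrative assumptions, not the library's implementation.

```python
import numpy as np

K_DM = 4.148808e3  # dispersion constant in MHz^2 pc^-1 cm^3 s

def dispersion_delay(dm, freq_mhz, ref_freq_mhz):
    """Delay in seconds of freq_mhz relative to ref_freq_mhz."""
    return K_DM * dm * (freq_mhz ** -2.0 - ref_freq_mhz ** -2.0)

def dedisperse(data, freqs_mhz, dm, period_s, ref_freq_mhz):
    """Rotate each channel of a (freq, phase) array to undo the delay."""
    nbin = data.shape[-1]
    out = np.empty_like(data)
    for i, f in enumerate(freqs_mhz):
        lag = int(round(dispersion_delay(dm, f, ref_freq_mhz) / period_s * nbin))
        out[i] = np.roll(data[i], -lag)
    return out

freqs = np.array([1400.0, 1300.0, 1200.0])
nbin, period, dm = 64, 0.005, 10.0
# Simulate a delta pulse that would sit at bin 10 without dispersion
dispersed = np.zeros((freqs.size, nbin))
for i, f in enumerate(freqs):
    lag = int(round(dispersion_delay(dm, f, freqs[0]) / period * nbin))
    dispersed[i, (10 + lag) % nbin] = 1.0
aligned = dedisperse(dispersed, freqs, dm, period, ref_freq_mhz=freqs[0])
```

Rotating by the opposite sign restores the dispersed pulse, which is the role of the reverse flag.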
- dededisperse([DM=None, wcfreq=False])¶
Runs dedisperse() with the reverse=True flag. See that function for parameter notation.
- calculateAverageProfile()¶
Calculate the average profile by performing an unweighted average along each dimension. Automatically calls calculateOffpulseWindow().
Todo
Perform a weighted average.
- calculateOffpulseWindow()¶
Calculate an off-pulse window using SinglePulse, with the windowsize parameter equal to one-eighth the number of phase bins.
- center([phase_offset=0.5])¶
Center the peak of the pulse in the middle of the data arrays.
- Parameters
phase_offset (float) – Determine the phase offset (in [0,1]) of the peak, i.e., impose an arbitrary rotation to where the center of the peak should fall.
- Returns
self
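Centering amounts to a circular rotation of the phase axis. The sketch below works on a single 1-D profile and returns the applied shift in bins, which is the quantity that would need to be remembered to correct arrival times later; the function name and return convention are hypothetical.

```python
import numpy as np

def center_peak(profile, phase_offset=0.5):
    """Rotate a 1-D profile so its maximum sits at phase_offset.

    Returns the rotated profile and the applied shift in bins.
    """
    nbin = len(profile)
    target = int(nbin * phase_offset) % nbin
    shift = target - int(np.argmax(profile))
    return np.roll(profile, shift), shift

profile = np.zeros(32)
profile[5] = 1.0
centered, shift = center_peak(profile)
```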
- removeBaseline()¶
Removes the baseline of the pulses given the off-pulse window of the average pulse profile pre-calculated by calculateAverageProfile().
- Returns
self
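One reading of the off-pulse window and baseline steps is sketched below: pick the contiguous window of one-eighth of the phase bins with the lowest mean in the average profile, then subtract each profile's mean over that window. The minimum-mean criterion and both function names are assumptions made for illustration, not a description of how SinglePulse selects the window.

```python
import numpy as np

def offpulse_window(profile, windowsize=None):
    """Indices of the contiguous (wrapping) window with the lowest mean."""
    nbin = len(profile)
    if windowsize is None:
        windowsize = nbin // 8
    # Mean of every wrapping window, indexed by its starting bin
    padded = np.concatenate([profile, profile[:windowsize - 1]])
    means = np.convolve(padded, np.ones(windowsize) / windowsize, mode="valid")
    start = int(np.argmin(means))
    return (start + np.arange(windowsize)) % nbin

def subtract_baseline(profiles, window):
    """Subtract each profile's mean over the off-pulse window (last axis)."""
    return profiles - profiles[..., window].mean(axis=-1, keepdims=True)

phase = np.arange(64)
average = 3.0 + 10.0 * np.exp(-0.5 * ((phase - 32) / 2.0) ** 2)
window = offpulse_window(average)
flat = subtract_baseline(np.stack([average, average + 1.0]), window)
```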
- remove_baseline()¶
See removeBaseline().
- getLevels([differences=False])¶
Returns calibration levels if the Archive is a calibrator in the form of a square wave signal. If differences is set to True, then this function will return the frequencies, the amplitude differences in the height of the square wave as a function of polarization/frequency, and the associated errors. If False, then it will return the frequencies, the mean values of the low and high portions of the square wave and the associated errors.
- Parameters
differences (bool) –
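For a single channel and polarization, the low and high levels of a square-wave calibrator can be estimated as below. This is a sketch only: the midpoint threshold, the standard-error estimate, and the return order are assumptions, and the frequencies that the method also returns are omitted.

```python
import numpy as np

def square_wave_levels(profile, differences=False):
    """Mean low/high levels of a square-wave profile, with errors.

    Bins above the midpoint between the minimum and maximum count as 'high'.
    """
    midpoint = 0.5 * (profile.min() + profile.max())
    high = profile[profile > midpoint]
    low = profile[profile <= midpoint]
    # Standard error of the mean for each state
    low_err = low.std(ddof=1) / np.sqrt(low.size)
    high_err = high.std(ddof=1) / np.sqrt(high.size)
    if differences:
        return high.mean() - low.mean(), np.hypot(low_err, high_err)
    return low.mean(), high.mean(), low_err, high_err

cal = np.concatenate([np.full(16, 2.0), np.full(16, 7.0)])
low_level, high_level, low_error, high_error = square_wave_levels(cal)
step, step_error = square_wave_levels(cal, differences=True)
```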
- getPulsarCalibrator()¶
Uses getLevels() to get a Calibrator object with associated metadata.
- Return type
Calibrator
- calibrate(psrcal[, fluxcal=None])¶
Polarization calibrates the data using another archive file. Flux calibration optional.
Warning
This function is under construction.
- getData([squeeze=True, setnan=None, weight=True])¶
Return the data array.
- setData(newdata)¶
Replaces the data array with new data. Must be the same shape.
- Parameters
newdata (numpy.ndarray) – New data array.
- getWeights([squeeze=True])¶
Return a copy of the weights array.
- Parameters
squeeze (bool) – All dimensions of length 1 are removed.
- setWeights(val[, t=None, f=None])¶
Set weights to a certain value. Can be used for RFI-excision routines.
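As a rough picture of how weight setting supports RFI excision, the sketch below zeroes entries of a (subintegration, channel) weights array. It assumes t and f are indices along those two axes and that None means "all"; the actual semantics of the method's arguments are not documented here.

```python
import numpy as np

weights = np.ones((4, 8))  # hypothetical (subintegration, channel) weights

def set_weights(weights, val, t=None, f=None):
    """Set weights to val for the given time and/or frequency indices."""
    t_sel = slice(None) if t is None else t
    f_sel = slice(None) if f is None else f
    weights[t_sel, f_sel] = val

set_weights(weights, 0.0, f=3)       # excise channel 3 in every subintegration
set_weights(weights, 0.0, t=1, f=5)  # excise a single (subintegration, channel) cell
```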
- saveData([filename=None, ext='npy', ascii=False])¶
Save the data array to a different format. Default is to save to a numpy binary file (.npy).
- Parameters
filename (str) – Filename to save the data to. If None, save to the archive’s original filename after replacing the extension with ext.
ext (str) – Filename extension.
ascii (bool) – Save the data to a text file. If all four dimensions have length greater than 1, the data are saved in time, polarization, frequency, and phase order, with intensity as the fifth column. Otherwise, use numpy’s savetxt() to output the array.
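The five-column text layout for a full four-dimensional array can be pictured as follows. The writer below is hypothetical: it emits integer indices for the first four columns and a fixed number format for the intensity, either of which may differ from what the library writes.

```python
import io
import numpy as np

def write_five_columns(data, handle):
    """Write one 'time pol freq phase intensity' row per element of a 4-D array."""
    for t, p, f, b in np.ndindex(*data.shape):
        handle.write(f"{t} {p} {f} {b} {data[t, p, f, b]:.6g}\n")

cube = np.arange(16, dtype=float).reshape(2, 2, 2, 2)
buffer = io.StringIO()
write_five_columns(cube, buffer)
rows = buffer.getvalue().splitlines()
```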
- outputPulses(filename)¶
Write out a standard .npy file by calling saveData().
- Parameters
filename (str) – Filename to save the data to.
- getAxis([flag=None, edges=False, wcfreq=False])¶
Get the time or frequency axes for plotting.
- Parameters
- Return type
numpy.ndarray
Todo
Let flag be both “T” and “F”.
- getFrequencies()¶
Convenience function for getAxis('F').
- getFreqs()¶
See getFrequencies().
- getTimes()¶
Convenience function for getAxis('T').
- getPulse(t[, f=None])¶
Get the pulse shape as a function of time and potentially frequency if provided. Assumes the shape of the data is polarization averaged.
Todo
Do not assume polarization averaging.
- getPeakFlux(t[, f=None])¶
Return the maximum value of the pulses, with parameters passed to getPulse().
- getIntegratedFlux(t[, f=None])¶
Return the integrated value of the pulses, with parameters passed to getPulse().
- getSinglePulses([func=None, windowsize=None, **kwargs])¶
Efficiently wrap the data array with SinglePulse.
- Parameters
func (function) – Arbitrary function to map onto the data array.
windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length.
**kwargs – Additional parameters passed to SinglePulse.
- Return type
numpy.ndarray of type np.object
- fitPulses(template,[nums=[0,1,2,3,4,5,6],flatten=False,func=None,windowsize=None,**kwargs])¶
Fit all of the pulses with a given template shape.
- Parameters
template (list/numpy.ndarray) – Template shape
nums (list/numpy.ndarray) – Numbers that denote which return values from fitPulse() of SinglePulse to keep. Example: to return only TOA values, use nums=[1]. For TOA values and scale factors, use nums=[1,3]. Defaults to all values.
flatten (bool) – Flatten the data array.
func (function) – Arbitrary function to map onto the data array.
windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length.
**kwargs – Additional parameters passed to SinglePulse.
- getDynamicSpectrum([window=None, template=None, mpw=None, align=None, windowsize=None, verbose=False, snr=False, maketemplate=True])¶
Return the dynamic spectrum.
- Parameters
window (numpy.ndarray) – Return the dynamic spectrum using only certain phase bins.
template (list/numpy.ndarray) – Generate the dynamic spectrum using the scale factor from template matching. Otherwise simply sum along the phase axis.
mpw (list/numpy.ndarray) – Main-pulse window if calculating the dynamic spectrum using a template. Required if a template is provided.
align (float) – Parameter passed to SinglePulse that describes a rotation of the pulse.
windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length.
verbose (bool) – Print the time index as each template is fit.
snr (bool) – Instead of the scale factors, return the signal-to-noise ratios.
maketemplate (bool) – Instead of supplying a template, make a basic smoothed one from the average pulse for matched filtering.
Warning
Return values are not well-defined. The function can return either the dynamic spectrum or a tuple of the scale factors, offsets, and errors of the template fit.
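The two ways of collapsing the phase axis described in the parameters can be sketched as below: a plain sum over (optionally windowed) phase bins, or a least-squares scale factor against a template. This simplified version fits no baseline offset, returns no errors, and ignores mpw and align, so it only illustrates the idea and is not the library's routine.

```python
import numpy as np

def dynamic_spectrum(data, window=None, template=None):
    """Collapse the phase axis of a (time, freq, phase) cube.

    Without a template, sum the (optionally windowed) phase bins. With one,
    return the scale factor b that minimizes |profile - b * template|^2.
    """
    if template is not None:
        template = np.asarray(template, dtype=float)
        return data @ template / np.dot(template, template)
    if window is not None:
        data = data[..., window]
    return data.sum(axis=-1)

template = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
gains = np.array([[1.0, 2.0], [3.0, 4.0]])  # (time, freq) scale factors
cube = gains[..., None] * template
fitted = dynamic_spectrum(cube, template=template)
summed = dynamic_spectrum(cube, window=[1, 2, 3])
```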
- plot([ax=None, show=True])¶
Basic plotter of the data, if the data array can be reduced to one dimension.
- Parameters
ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
show (bool) – Generate a matplotlib plot display.
- imshow([ax=None, cbar=False, mask=None, show=True, **kwargs])¶
Basic plotter of the data, if the data array can be reduced to two dimensions. The origin is set to the lower left.
- Parameters
ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
cbar (bool) – Include a matplotlib colorbar.
mask (numpy.ndarray) – Apply a mask array using the conventions of a numpy masked array (numpy.ma.core.MaskedArray)
show (bool) – Generate a matplotlib plot display.
**kwargs – Additional arguments to pass to imshow.
- pavplot([ax=None, mode='GTpd', show=True, wcfreq=True])¶
Produces a PSRCHIVE pav-like plot for comparison
- Parameters
ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
- waterfall([offset=None, border=0, labels=True, album=False, bins=None, show=True])¶
Produce a waterfall plot if the data array can be reduced to two dimensions.
- joyDivision([border=0.1, labels=False, album=True, **kwargs])¶
Calls waterfall() in the style of the Joy Division album cover. All parameters are passed to the function.
- time(template, filename[, MJD=False, wcfreq=False, **kwargs])¶
Calculate times-of-arrival (TOAs).
- Parameters
template (list/numpy.ndarray/Archive) – Template shape to fit to the pulses.
filename (str) – Path to save text to. If filename=None, print the text.
MJD (bool) – Calculate absolute TOAs in MJD units instead of relative TOAs in bin (time) units.
simple (bool) –
wcfreq (bool) – Use the weighted center frequency.
Warning
MJD=True is currently under testing and comparisons with PSRCHIVE.
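To give a sense of what a relative TOA in bin units means, the sketch below estimates the integer-bin shift between a profile and a template with a circular cross-correlation. This is a deliberately simpler, swapped-in technique, not the fitting routine the method uses, and it has no sub-bin precision or error estimate.

```python
import numpy as np

def relative_toa_bins(profile, template):
    """Integer-bin shift of profile relative to template (circular cross-correlation)."""
    # Correlation theorem: ccf = IFFT(FFT(profile) * conj(FFT(template)))
    spectrum = np.fft.rfft(profile) * np.conj(np.fft.rfft(template))
    ccf = np.fft.irfft(spectrum, n=len(profile))
    return int(np.argmax(ccf))

bins = np.arange(64)
template = np.exp(-0.5 * ((bins - 20) / 2.0) ** 2)
late = relative_toa_bins(np.roll(template, 7), template)
early = relative_toa_bins(np.roll(template, -3), template)
```

Multiplying the shift by the time per phase bin converts it to time units; shifts are reported modulo the number of bins, so an early pulse appears as a large positive value.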
- getPeriod([header=False])¶
Returns the period of the pulsar. By default returns the Polyco-calculated period. Otherwise, returns the period as calculated by the pulsar parameter table. If a calibrator file, returns 1 divided by the header CAL_FREQ value.
- getValue(value)¶
Looks for a key in one of the headers and returns the value. First looks in the primary header, then the subintegration header, then the pulsar parameter table if it exists.
- getSubintinfo(value)¶
Looks for a key in the subintegration header, a subset of the functionality of getValue().
- getMJD([full=False, numwrap=float])¶
- getTbin([numwrap=float])¶
Returns the time per phase bin.
- Parameters
numwrap (type) – Cast the return value into a type.
- Return type
Value given by numwrap
- getCoords([parse=True])¶
Returns the header coordinate (RA, DEC) values.
- Parameters
parse (bool) – Return each value as a tuple of floats
- Returns
RA, Dec, each either as strings or as tuples.
- getPulsarCoords([parse=True])¶
See getCoords().
- getBandwidth([header=False])¶
Returns the observation bandwidth as the product of the channel bandwidth (subintegration header CHAN_BW) and the number of channels (subintegration header NCHAN) values.
- getDurations()¶
Return the subintegration durations array.
- Return type
numpy.ndarray
Todo
Check for completeness of inputs into the durations array
- getCenterFrequency([weighted=False])¶
Returns the center frequency. If a HISTORY table is provided in the PSRFITS file, return the latest CTR_FREQ value. Otherwise, return the header OBSFREQ value.
- getFreqUnit()¶
- getScaleUnit()¶
See
getDataUnit()
- getIntensityUnit()¶
See
getDataUnit()
- getFluxDensityUnit()¶
See
getDataUnit()
- getFluxUnit()¶
See
getDataUnit()
- isCalibrator()¶
Returns whether the file is a calibration observation, as given by the OBS_MODE flag in the header.
- Return type
bool
- record(frame)¶
Internal function that runs within state-changing functions to record those state changes to a history variable that can be written out if the archive is saved.
- Parameters
frame (frame) – Frame object returned by python’s inspect module.
- print_pypulse_history()¶
Prints all elements in the PyPulse history list.
History Class
The History class stores the History table in the PSRFITS file. Typical users should not need to worry about using this class directly. It can be accessed in an Archive ar using ar.history (no function call).
- class History(history)¶
- Parameters
history (pyfits.hdu.table.BinTableHDU) – The binary table header data unit (HDU).
- getValue(field[, num=None])
Returns a dictionary array value.
- Parameters
field (str) – A column name (i.e., as provided by hdulist['HISTORY'].columns)
- Example
getValue('NCHAN') will return a list of the frequency channelization history of the file.
- getLatest(field)
Returns the latest key value for a given field.
- Parameters
field (str) – A column name (i.e., as provided by hdulist['HISTORY'].columns)
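The relationship between getValue() and getLatest() can be pictured with a small stand-in class. HistorySketch is hypothetical: it stores columns in a plain dictionary rather than reading a FITS binary table, and it assumes num selects a single entry of the column.

```python
class HistorySketch:
    """Hypothetical stand-in mapping column names to per-entry value lists."""

    def __init__(self, columns):
        self.columns = {name: list(values) for name, values in columns.items()}

    def getValue(self, field, num=None):
        """Return the whole column, or entry num if given."""
        values = self.columns[field]
        return values if num is None else values[num]

    def getLatest(self, field):
        """Return the most recent (last) entry of a column."""
        return self.columns[field][-1]

history = HistorySketch({"NCHAN": [512, 64, 1], "CTR_FREQ": [1400.0, 1400.0, 1380.5]})
```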
Polyco Class
The Polyco class stores the Polyco table in the PSRFITS file. Typical users should not need to worry about using this class directly. It can be accessed in an Archive ar using ar.polyco (no function call).
- class Polyco(polyco[, MJD=None])¶
- Parameters
MJD (float) – A default MJD to calculate the Polyco on.
- getValue(field[, num=None])
Returns a dictionary array value.
- Parameters
field (str) – A column name (i.e., as provided by hdulist['POLYCO'].columns)
- getLatest(field)
Returns the latest key value for a given field.
- Parameters
field (str) – A column name (i.e., as provided by hdulist['POLYCO'].columns)