Archive Class

The Archive class is the primary mechanism for opening PSRFITS files.

class Archive(filename[, prepare=True, lowmem=False, verbose=True, weight=True, center_pulse=True, baseline_removal=True, wcfreq=True, thread=False, onlyheader=False])
Parameters:
  • prepare (bool) – Argument passed to load(). If True, then the file will be automatically polarization averaged with pscrunch(), dedispersed with dedisperse() (using the weighted center frequency if wcfreq is set to True), and centered with center() if center_pulse is set to True.
  • lowmem (bool) – Argument passed to load(). If True, then the PSRFITS file is opened in memmap mode and the data arrays are also replaced with memmaps.
  • verbose (bool) – Print extra information on loading and processing.
  • weight (bool) – Argument passed to load(). Use the stored data weights, which is the typical mode.
  • center_pulse (bool) – Argument passed to load(). If True, then the peak of the pulse is centered in the middle of the data arrays. This is preferred for plotting purposes but the resulting arrival-time shifts are computed and stored internally.
  • baseline_removal (bool) – Argument passed to load(). Subtracts the baseline intensity of the average profile off-pulse region from all individual data profiles using remove_baseline().
  • wcfreq (bool) – Argument passed to load(). If True, then the weighted center frequency is used in dedisperse() if prepare=True.
  • thread (bool) – Argument passed to load(). If True, then the calculation of the data array will be parallelized, which can lead to some speed-up for large data files but will take longer for small data files given the extra overhead required to start the process.
  • onlyheader (bool) – Argument passed to load(). If True, then only the primary and table headers are processed, without the data array. This is much faster if you only need access to metadata.

Usage:

ar = Archive(FILENAME) #loads archive, dedispersed and polarization averaged by default
ar.tscrunch() #averages the pulse in time
data = ar.getData() #returns the numpy data array for use by you
ar.imshow() #plots frequency vs phase for the pulses

Description of Data

From Appendix A.1 of the thesis Lam 2016:

The primary data array of profiles in a PSRFITS file is given by \(\mathcal{I}(t,\mathrm{pol},\nu,\phi)\), the pulse intensity as a function of time \(t\), polarization \(\mathrm{pol}\), frequency \(\nu\), and phase \(\phi\), where the arguments are in the order of the array dimensions. To save memory, intensity data are stored in multiple arrays. The raw data array (DATA) \(d\) is the largest in dimensionality but for folded pulse data is typically stored as an array of 16-bit integers. To retrieve the raw data value for each pulse profile, the data array is then multiplied by a scale array (DAT_SCL) \(s\) and an offset array (DAT_OFFS) \(o\) is added. An array of weights (DAT_WTS) \(w\) is also stored internally and typically modifies the raw data, e.g., when excising radio frequency interference. The three modifier arrays are of much smaller size than the data array and are typically stored in 32-bit single-precision float format. Mathematically, the resultant array of pulse intensities can be written as

\[\mathcal{I}(t,\mathrm{pol},\nu,\phi) = \left[s(t,\mathrm{pol},\nu)\times d(t,\mathrm{pol},\nu,\phi)+o(t,\mathrm{pol},\nu)\right] w(t,\nu).\]
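The expression above maps directly onto NumPy broadcasting. The following is a minimal sketch of that arithmetic, not PyPulse's own loading code; the function name assemble_intensity and the toy array shapes are assumptions made for illustration.

```python
import numpy as np

def assemble_intensity(d, s, o, w):
    """Combine DATA (d), DAT_SCL (s), DAT_OFFS (o) and DAT_WTS (w).

    Shapes assumed here: d is (nsub, npol, nchan, nbin), s and o are
    (nsub, npol, nchan), and w is (nsub, nchan).
    """
    # Add a trailing phase axis to s and o so they broadcast against d.
    scaled = s[..., np.newaxis] * d + o[..., np.newaxis]
    # Add polarization and phase axes to w.
    return scaled * w[:, np.newaxis, :, np.newaxis]

# Toy example: 2 subintegrations, 1 polarization, 3 channels, 4 bins.
d = np.ones((2, 1, 3, 4), dtype=np.int16)
s = np.full((2, 1, 3), 2.0, dtype=np.float32)
o = np.full((2, 1, 3), 1.0, dtype=np.float32)
w = np.ones((2, 3), dtype=np.float32)
w[:, 1] = 0.0  # pretend channel 1 was excised as interference
intensity = assemble_intensity(d, s, o, w)
```

In this toy case every profile in a channel with unit weight equals 2 × 1 + 1 = 3, and the zero-weighted channel is zero everywhere.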

PSRFITS files also contain a wide range of additional information stored internally, including a history of all PSRCHIVE modifications to the file, a folding ephemeris, and a large global header of useful metadata. Besides the data array, PyPulse will unpack and store all extra information for retrieval via get() methods as desired.

Methods

load(filename[, prepare=True, center_pulse=True, baseline_removal=True, weight=True, wcfreq=True, onlyheader=False])

Load a PSRFITS file, process the metadata, and form the data arrays. This is called internally by __init__().

Parameters:
  • filename (str) – Path to load file from.
  • prepare (bool) – This performs three tasks. It will polarization average the data via pscrunch(), dedisperse the data with dedisperse(), and rotate the pulse so that the peak is in the center of phase with center(). For centering, this will store the relevant time delays associated with the rotation.
  • center_pulse (bool) – The peak of the pulse is centered in the middle of the data arrays. This is preferred for plotting purposes but the resulting arrival-time shifts are computed and stored internally.
  • baseline_removal (bool) – Subtract the baseline intensity of the average profile off-pulse region from all individual data profiles.
  • weight (bool) – Use the stored data weights, which is the typical mode.
  • wcfreq (bool) – The weighted center frequency is used in dedisperse() if prepare=True.
  • onlyheader (bool) – Only the primary and table headers are processed, without the data array. This is much faster if you only need access to metadata.
Returns:

None

save(filename)

Save the data to a new PSRFITS file.

Parameters:filename (str) – Path to save file to.

Warning

save() will output a PSRFITS file but the output data arrays vary slightly from the input data arrays. More

unload(filename)

Same as save(). Follows PSRCHIVE convention.

gc()

Manually clear the data cube and weights for Python garbage collection.

shape([squeeze=True])

Return the shape of the data array.

Parameters:squeeze (bool) – Return the shape of the data array when dimensions of length 1 are removed.
Returns:shape, tuple of integers
reset([prepare=True])

Replace the data with the original clone, preventing full reloading. Useful for larger files but only if the lowmem flag is set to True.

Parameters:prepare (bool) – Argument passed to load().
scrunch([arg='Dp', **kwargs])

Average the data cube along different axes.

Parameters:arg (str) – Can be T for tscrunch(), p for pscrunch(), F for fscrunch(), B for bscrunch(), and D for dedisperse(), following the PSRCHIVE conventions.
Returns:self
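
The flag string can be read as a sequence of single-character commands. The sketch below shows one way such a string could be translated into method names; the table SCRUNCH_FLAGS and the function parse_scrunch_arg are hypothetical names for illustration, and processing the flags in string order is an assumption rather than documented PyPulse behavior.

```python
# Hypothetical lookup table: flag character -> method name, as listed above.
SCRUNCH_FLAGS = {
    "T": "tscrunch",
    "p": "pscrunch",
    "F": "fscrunch",
    "B": "bscrunch",
    "D": "dedisperse",
}

def parse_scrunch_arg(arg="Dp"):
    """Translate a flag string into an ordered list of method names."""
    unknown = [flag for flag in arg if flag not in SCRUNCH_FLAGS]
    if unknown:
        raise ValueError(f"unrecognized scrunch flags: {unknown}")
    return [SCRUNCH_FLAGS[flag] for flag in arg]

default_steps = parse_scrunch_arg()       # the default 'Dp'
all_steps = parse_scrunch_arg("DpTFB")
```

With the default argument this yields dedispersion followed by polarization averaging.
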
tscrunch([nsubint=None, factor=None])

Perform a weighted average of the data cube along the time dimension.

Parameters:
  • nsubint (int) – Time average to this many subintegrations
  • factor (int) – Time average by this factor
Returns:

self
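
As a rough illustration of a weighted time average, the sketch below collapses the time axis of a four-dimensional array using per-subintegration, per-channel weights. It assumes unweighted input profiles and full averaging over all subintegrations; the function name weighted_time_average is hypothetical and this is not PyPulse's implementation.

```python
import numpy as np

def weighted_time_average(data, weights):
    """Average (nsub, npol, nchan, nbin) data over time.

    weights has shape (nsub, nchan), matching the DAT_WTS layout.
    """
    w = weights[:, np.newaxis, :, np.newaxis].astype(float)
    numerator = (data * w).sum(axis=0)
    wsum = w.sum(axis=0)
    # Channels whose weights sum to zero stay at zero instead of dividing by zero.
    return np.divide(numerator, wsum,
                     out=np.zeros_like(numerator), where=wsum > 0)

# Two subintegrations, one polarization, two channels, three bins.
data = np.empty((2, 1, 2, 3))
data[0] = 1.0
data[1] = 5.0
weights = np.array([[1.0, 0.0],
                    [3.0, 0.0]])
avg = weighted_time_average(data, weights)
```

Channel 0 averages to (1 × 1 + 3 × 5) / 4 = 4, while the fully zero-weighted channel 1 is returned as zeros.
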

pscrunch()

Perform an average of the data cube along the polarization dimension. Can handle data in Coherence (AABBCRCI) or Stokes (IQUV) format.

Returns:self

Todo

Perform a weighted average of the data cube
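
For orientation, the two formats reduce to total intensity differently: in the coherence format the total intensity is formed from the AA and BB products, while in the Stokes format it is the I component. The sketch below shows that reduction under the assumption that the first two polarization slots hold AA and BB (or that the first slot holds I); the function name total_intensity is hypothetical, and any extra normalization PyPulse may apply (for example a factor of one half) is not covered here.

```python
import numpy as np

def total_intensity(data, pol_type):
    """Reduce (nsub, npol, nchan, nbin) data to a single polarization."""
    if pol_type == "AABBCRCI":
        # Coherence products: sum the two auto-correlations AA and BB.
        out = data[:, 0] + data[:, 1]
    elif pol_type == "IQUV":
        # Stokes parameters: I is already the total intensity.
        out = data[:, 0]
    else:
        raise ValueError(f"unsupported polarization type: {pol_type}")
    # Keep a length-1 polarization axis so the array stays four-dimensional.
    return out[:, np.newaxis]

coherence = np.zeros((1, 4, 2, 8))
coherence[:, 0] = 1.0  # AA
coherence[:, 1] = 2.0  # BB
from_coherence = total_intensity(coherence, "AABBCRCI")

stokes = np.zeros((1, 4, 2, 8))
stokes[:, 0] = 3.0     # I
from_stokes = total_intensity(stokes, "IQUV")
```
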

fscrunch([nchan=None, factor=None])

Perform a weighted average of the data cube along the frequency dimension.

Parameters:
  • nchan (int) – Frequency average to this many channels
  • factor (int) – Frequency average by this factor
Returns:

self

bscrunch([nbins=None, factor=None])

Perform an average of the data cube along the phase (bin) dimension.

Parameters:
  • nbins (int) – Phase average to this many bins
  • factor (int) – Phase average by this factor
Returns:

self

Todo

Perform a weighted average of the data cube
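
Averaging by an integer factor along the phase axis can be sketched with a reshape followed by a mean. This is an unweighted illustration that assumes the number of bins is divisible by the factor; the function name bin_average is hypothetical.

```python
import numpy as np

def bin_average(data, factor):
    """Average adjacent groups of `factor` phase bins along the last axis."""
    nbin = data.shape[-1]
    if nbin % factor != 0:
        raise ValueError("number of bins must be divisible by factor")
    # Split the phase axis into (new bins, factor) and average over factor.
    new_shape = data.shape[:-1] + (nbin // factor, factor)
    return data.reshape(new_shape).mean(axis=-1)

profile = np.arange(8.0)
reduced = bin_average(profile, 2)  # pairs (0,1), (2,3), (4,5), (6,7)
cube = bin_average(np.ones((2, 1, 3, 8)), 4)
```
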

dedisperse([DM=None, reverse=False, wcfreq=False])

Dedisperse the pulses by introducing the appropriate time delays and rotating in phase.

Parameters:
  • DM (float) – Dispersion measure used to compute the dispersive time delays.
  • reverse (bool) – Perform dispersion of the pulse profiles.
  • wcfreq (bool) – Use the weighted center frequency.
Returns:

self
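
The idea behind dedispersion can be sketched as follows: compute the cold-plasma dispersion delay of each channel relative to a reference frequency, convert it to phase bins, and rotate each profile back by that amount. The sketch assumes frequencies in MHz, the dispersion constant 4.148808 × 10³ s MHz² pc⁻¹ cm³ (some packages use the slightly different value 1/2.41 × 10⁻⁴), and whole-bin rotation with numpy.roll for simplicity. All function names are hypothetical and this is not PyPulse's implementation.

```python
import numpy as np

K_DM = 4.148808e3  # s MHz^2 pc^-1 cm^3 (assumed value of the dispersion constant)

def dispersion_delays(freqs_mhz, dm, ref_freq_mhz):
    """Delay in seconds of each channel relative to the reference frequency."""
    freqs = np.asarray(freqs_mhz, dtype=float)
    return K_DM * dm * (freqs ** -2 - ref_freq_mhz ** -2)

def dedisperse_profiles(profiles, freqs_mhz, dm, period, ref_freq_mhz,
                        reverse=False):
    """Rotate (nchan, nbin) profiles to remove (or re-apply) dispersion."""
    nbin = profiles.shape[-1]
    delays = dispersion_delays(freqs_mhz, dm, ref_freq_mhz)
    shifts = np.rint(delays / period * nbin).astype(int)
    sign = 1 if reverse else -1  # reverse=True re-applies the delays
    return np.array([np.roll(p, sign * s) for p, s in zip(profiles, shifts)])

freqs = [1400.0, 1500.0]
aligned = np.zeros((2, 64))
aligned[:, 10] = 1.0  # same peak bin in both channels
delays = dispersion_delays(freqs, dm=10.0, ref_freq_mhz=1500.0)
dispersed = dedisperse_profiles(aligned, freqs, 10.0, 0.005, 1500.0, reverse=True)
restored = dedisperse_profiles(dispersed, freqs, 10.0, 0.005, 1500.0)
```

The lower-frequency channel arrives later, so dispersing moves its peak away from bin 10 and dedispersing moves it back.
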

dededisperse([DM=None, wcfreq=False])

Runs dedisperse() with the reverse=True flag. See that function for parameter notation.

calculateAverageProfile()

Calculate the average profile by performing an unweighted average along each dimension. Automatically calls calculateOffpulseWindow().

Todo

Perform a weighted average.

calculateOffpulseWindow()

Calculate an off-pulse window using the SinglePulse class, with the windowsize parameter equal to one-eighth the number of phase bins.

center([phase_offset=0.5])

Center the peak of the pulse in the middle of the data arrays.

Parameters:phase_offset (float) – Determine the phase offset (in [0,1]) of the peak, i.e., impose an arbitrary rotation to where the center of the peak should fall.
Returns:self
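
Centering amounts to a cyclic rotation of every profile by the same number of bins. The following sketch locates the peak of the average profile and rolls the array so that the peak lands at the requested phase; it uses whole-bin shifts, returns the applied shift so it can be tracked, and uses the hypothetical function name center_profiles.

```python
import numpy as np

def center_profiles(data, phase_offset=0.5):
    """Roll the last (phase) axis so the average-profile peak sits at phase_offset."""
    nbin = data.shape[-1]
    # Average over every axis except phase to find the overall peak.
    average = data.reshape(-1, nbin).mean(axis=0)
    peak_bin = int(np.argmax(average))
    target_bin = int(phase_offset * nbin) % nbin
    shift = target_bin - peak_bin
    return np.roll(data, shift, axis=-1), shift

data = np.zeros((2, 1, 1, 16))
data[..., 3] = 1.0  # peak at bin 3 of 16
centered, shift = center_profiles(data)
```
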
removeBaseline()

Removes the baseline of the pulses given the off-pulse window of the average pulse profile pre-calculated by calculateAverageProfile().

Returns:self

remove_baseline()

See removeBaseline().
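
A simple way to picture the off-pulse window and baseline subtraction is sketched below. It assumes the off-pulse window is the contiguous, wrap-around window of the average profile with the smallest summed intensity, and that each profile has its own mean over that window subtracted. Both choices are assumptions for illustration and the function names are hypothetical.

```python
import numpy as np

def offpulse_window(profile, windowsize):
    """Bin indices of the contiguous window with the lowest summed intensity."""
    nbin = len(profile)
    # Append the first bins so that windows may wrap around the phase boundary.
    wrapped = np.concatenate([profile, profile[:windowsize - 1]])
    sums = np.convolve(wrapped, np.ones(windowsize), mode="valid")
    start = int(np.argmin(sums))
    return (start + np.arange(windowsize)) % nbin

def subtract_baseline(data, window):
    """Subtract each profile's mean over the off-pulse bins."""
    baseline = data[..., window].mean(axis=-1, keepdims=True)
    return data - baseline

nbin = 32
profiles = np.array([np.full(nbin, 5.0), np.full(nbin, 7.0)])
profiles[:, 14:18] += 10.0            # a pulse on top of each baseline
average = profiles.mean(axis=0)
window = offpulse_window(average, nbin // 8)
cleaned = subtract_baseline(profiles, window)
```
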

getLevels([differences=False])

Returns calibration levels if the Archive is a calibrator in the form of a square wave signal. If differences is set to True, then this function will return the frequencies, the amplitude differences in the height of the square wave as a function of polarization/frequency, and the associated errors. If False, then it will return the frequencies, the mean values of the low and high portions of the square wave and the associated errors.

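Parameters:differences (bool) – If True, return the amplitude differences of the square wave instead of the mean low and high levels.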
getPulsarCalibrator()

Uses getLevels() to get a Calibrator object with associated metadata

Return type:Calibrator
calibrate(psrcal[, fluxcal=None])

Polarization calibrates the data using another archive file. Flux calibration optional.

Parameters:
  • psrcal (Archive) – Pulsar calibrator Archive.
  • fluxcal (Archive) – Flux calibrator Archive.

Warning

This function is under construction.

getData([squeeze=True, setnan=None, weight=True])

Return the data array.

Parameters:
  • squeeze (bool) – All dimensions of length 1 are removed.
  • setnan (float) – Replace all np.nan with value.
  • weight (bool) – Return the data array with weights applied.
Return type:

numpy.ndarray

setData(newdata)

Replaces the data array with new data. Must be the same shape.

Parameters:newdata (numpy.ndarray) – New data array.
getWeights([squeeze=True])

Return a copy of the weights array.

Parameters:squeeze (bool) – All dimensions of length 1 are removed.
setWeights(val[, t=None, f=None])

Set weights to a certain value. Can be used for RFI-excision routines.

Parameters:
  • val (float) – Value to set the weights to.
  • t (int) – Time index
  • f (int) – Frequency index
saveData([filename=None, ext='npy', ascii=False])

Save the data array to a different format. Default is to save to a numpy binary file (.npy).

Parameters:
  • filename (str) – Filename to save the data to. If none, save to the archive’s original filename after replacing the extension with ext.
  • ext (str) – Filename extension
  • ascii (bool) – Save the data to a text file. If all four dimensions have length greater than 1, the data are saved in time, polarization, frequency, and phase order, with intensity as the fifth column. Otherwise, use numpy’s savetxt() to output the array.
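
The five-column text layout described for ascii can be pictured with the sketch below, which flattens a four-dimensional array into rows of (time, polarization, frequency, phase, intensity). It assumes the first four columns are array indices rather than physical values; the function name flatten_to_columns is hypothetical and this is not PyPulse's writer.

```python
import numpy as np

def flatten_to_columns(data):
    """Return rows of (t, pol, f, phase, intensity) for a 4-D array."""
    # np.ndindex walks the indices with the last (phase) axis varying fastest.
    rows = [index + (data[index],) for index in np.ndindex(data.shape)]
    return np.array(rows, dtype=float)

data = np.arange(16.0).reshape(2, 2, 2, 2)
table = flatten_to_columns(data)
```

The resulting two-dimensional table could then be handed to numpy.savetxt() to produce the text file.
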
outputPulses(filename)

Write out a standard .npy file by calling saveData().

Parameters:filename (str) – Filename to save the data to.
getAxis([flag=None, edges=False, wcfreq=False])

Get the time or frequency axes for plotting.

Parameters:
  • flag (str) – “T” for the time axis, “F” for the frequency axis.
  • edges (bool) – Do not return the centers for each subintegration/channel but rather return the edges. Better for imshow plotting because of the extent parameter.
  • wcfreq (bool) – Use the weighted center frequency.
Return type:

numpy.ndarray

Todo

Let flag be both “T” and “F”.
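
The difference between centers and edges is easiest to see with a small example: N channel centers correspond to N + 1 edges. The sketch below assumes uniformly spaced centers (at least two of them); the function name centers_to_edges is hypothetical.

```python
import numpy as np

def centers_to_edges(centers):
    """Convert N uniformly spaced bin centers into N + 1 bin edges."""
    centers = np.asarray(centers, dtype=float)
    step = centers[1] - centers[0]
    # Shift every center down by half a step, then append the final upper edge.
    return np.concatenate([centers - step / 2, [centers[-1] + step / 2]])

edges = centers_to_edges([1.0, 2.0, 3.0])
```

The first and last edges are the values one would pass as axis limits when plotting the array as an image.
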

getFrequencies()

Convenience function for getAxis('F').

getFreqs()

See getFrequencies().

getTimes()

Convenience function for getAxis('T').

getPulse(t[, f=None])

Get the pulse shape as a function of time and potentially frequency if provided. Assumes the shape of the data is polarization averaged.

Parameters:
  • t (int) – Time index
  • f (int) – Frequency index
Return type:

numpy.ndarray

Todo

Do not assume polarization averaging.

getPeakFlux(t[, f=None])

Return the maximum value of the pulses, with parameters passed to getPulse()

Parameters:
  • t (int) – Time index
  • f (int) – Frequency index
Return type:

float

getIntegratedFlux(t[, f=None])

Return the integrated value of the pulses, with parameters passed to getPulse()

Parameters:
  • t (int) – Time index
  • f (int) – Frequency index
Return type:

float

getSinglePulses([func=None, windowsize=None, **kwargs])

Efficiently wrap the data array with SinglePulse.

Parameters:
  • func (function) – Arbitrary function to map onto the data array.
  • windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length
  • **kwargs – Additional parameters passed to SinglePulse
Return type:

numpy.ndarray of type np.object

fitPulses(template, nums[, flatten=False, func=None, windowsize=None, **kwargs])

Fit all of the pulses with a given template shape.

Parameters:
  • template (list/numpy.ndarray) – Template shape
  • nums (list/numpy.ndarray) – Indices that select which return values of fitPulse() from SinglePulse to keep. Example: to return only TOA values, use nums=[1]. For TOA values and scale factors, use nums=[1,3].
  • flatten (bool) – Flatten the data array.
  • func (function) – Arbitrary function to map onto the data array.
  • windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length
  • **kwargs – Additional parameters passed to SinglePulse
getDynamicSpectrum([window=None, template=None, mpw=None, align=None, windowsize=None, verbose=False, snr=False, maketemplate=True])

Return the dynamic spectrum.

Parameters:
  • window (numpy.ndarray) – Return the dynamic spectrum using only certain phase bins.
  • template (list/numpy.ndarray) – Generate the dynamic spectrum using the scale factor from template matching. Otherwise simply sum along the phase axis.
  • mpw (list/numpy.ndarray) – Main-pulse window if calculating the dynamic spectrum using a template. Required if a template is provided.
  • align (float) – Parameter passed to SinglePulse that describes a rotation of the pulse.
  • windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length
  • verbose (bool) – Print the time index as each template is fit.
  • snr (bool) – Instead of the scale factors, return the signal-to-noise ratios.
  • maketemplate (bool) – Instead of supplying a template, make a basic smoothed one from the average pulse for matched filtering.

Warning

Return values are not well-defined. This can return either the dynamic spectrum or a tuple of the scale factors, offsets, and errors of the template fit.

plot([ax=None, show=True])

Basic plotter of the data, if the data array can be reduced to one dimension.

Parameters:
  • ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
  • show (bool) – Generate a matplotlib plot display.
imshow([ax=None, cbar=False, mask=None, show=True, **kwargs])

Basic plotter of the data, if the data array can be reduced to two dimensions. The origin is set to the lower left.

Parameters:
  • ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
  • cbar (bool) – Include a matplotlib colorbar.
  • mask (numpy.ndarray) – Apply a mask array using the conventions of a numpy masked array (numpy.ma.core.MaskedArray)
  • show (bool) – Generate a matplotlib plot display.
  • **kwargs – Additional arguments to pass to imshow.
pavplot([ax=None, mode='GTpd', show=True, wcfreq=True])

Produces a PSRCHIVE pav-like plot for comparison.

Parameters:ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
waterfall([offset=None, border=0, labels=True, album=False, bins=None, show=True])

Produce a waterfall plot if the data array can be reduced to two dimensions.

Parameters:
  • offset (float) – Y offset of the data
  • border (float) – Fractional border around pulses.
  • labels (bool) – Plot tick labels.
  • album (bool) – Plot white on black background instead of black on white background.
  • bins (numpy.ndarray) – Selection of phase bins to plot
joyDivision([border=0.1, labels=False, album=True, **kwargs])

Calls waterfall() in the style of the Joy Division album cover. All parameters are passed to the function.

time(template, filename[, MJD=False, wcfreq=False, **kwargs])

Calculate times-of-arrival (TOAs).

Parameters:
  • template (list/numpy.ndarray/Archive) – Template shape to fit to the pulses.
  • filename (str) – Path to save text to. If filename=None, print the text.
  • MJD (bool) – Calculate absolute TOAs in MJD units instead of relative TOAs in bin (time) units.
  • simple (bool) –
  • wcfreq (bool) – Use the weighted center frequency.

Warning

MJD=True is currently under testing and comparisons with PSRCHIVE.

getNsubint()

Returns the current number of subintegrations.

Return type:int
getNpol()

Returns the current number of polarization states.

Return type:int
getNchan()

Returns the current number of frequency channels.

Return type:int
getNbin()

Returns the current number of phase bins.

Return type:int
getPeriod([header=False])

Returns the period of the pulsar. By default returns the Polyco-calculated period. Otherwise, returns the period as calculated by the pulsar parameter table. If a calibrator file, returns 1 divided by the header CAL_FREQ value.

Parameters:header (bool) – Enforce a return of the pulsar parameter table value.
Return type:float
getValue(value)

Looks for a key in one of the headers and returns the value. First looks in the primary header, then the subintegration header, then the pulsar parameter table if it exists.

Parameters:value (str) – Value to look for.
Return type:str
getSubintinfo(value)

Looks for a key in the subintegration header, a subset of the functionality of getValue()

Parameters:value (str) – Value to look for.
Return type:str
getName()

Returns the header SRC_NAME value.

Return type:str
getMJD([full=False, numwrap=float])
getTbin([numwrap=float])

Returns the time per phase bin.

Parameters:numwrap (type) – Cast the return value into a type.
Return type:Value given by numwrap
getDM()

Returns the subintegration header DM value.

Return type:float
getRM()

Returns the subintegration header RM value.

Return type:float
getCoords([parse=True])

Returns the header coordinate (RA, DEC) values.

Parameters:parse (bool) – Return each value as a tuple of floats
Returns:RA, Dec, each either as a string or as a tuple of floats.
getPulsarCoords([parse=True])

See getCoords().

getTelescopeCoords()

Returns the header ANT_X, ANT_Y, ANT_Z values.

Return type:tuple
getBandwidth([header=False])

Returns the observation bandwidth as the product of the channel bandwidth (subintegration header CHAN_BW) and the number of channels (subintegration header NCHAN) values.

Parameters:header (bool) – Returns the header OBSBW value
Return type:float
getDuration()

Returns the sum of the subintegration header TSUBINT values.

Return type:float
getDurations()

Return the subintegration durations array.

Return type:numpy.ndarray

Todo

Check for completeness of inputs into the durations array

getCenterFrequency([weighted=False])

Returns the center frequency. If a HISTORY table is provided in the PSRFITS file, return the latest CTR_FREQ value. Otherwise, return the header OBSFREQ value.

Parameters:weighted (bool) – Return the center frequency weighted by the weights array, \(\sum_i w_i \nu_i / \sum_i w_i\) for frequency channel \(i\).
Return type:float
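
The weighted mean used for the weighted option can be sketched in a few lines. The sketch assumes the (nsub, nchan) weights are first summed over time to give one weight per channel; that reduction and the function name weighted_center_frequency are assumptions for illustration.

```python
import numpy as np

def weighted_center_frequency(freqs, weights):
    """Weighted mean of the channel frequencies.

    freqs has shape (nchan,), weights has shape (nsub, nchan).
    """
    freqs = np.asarray(freqs, dtype=float)
    # Collapse the time axis so that each channel has a single weight.
    channel_weights = np.asarray(weights, dtype=float).sum(axis=0)
    return float(np.sum(channel_weights * freqs) / np.sum(channel_weights))

wc = weighted_center_frequency([1000.0, 2000.0],
                               [[1.0, 3.0],
                                [1.0, 3.0]])
```

Here the upper channel carries three times the weight of the lower one, so the result, 1750, lies closer to 2000 than the unweighted midpoint of 1500.
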
getFrequencyUnit()

Returns the unit associated with the frequency axis.

Return type:str
getFreqUnit()

See getFrequencyUnit()

getTimeUnit()

Returns the unit associated with the time axis.

Return type:str
getDataUnit()

Returns the unit associated with the data.

Return type:str
getScaleUnit()

See getDataUnit()

getIntensityUnit()

See getDataUnit()

getFluxDensityUnit()

See getDataUnit()

getFluxUnit()

See getDataUnit()

getTelescope()

Returns the header TELESCOP value.

Return type:str
getFrontend()

Returns the header FRONTEND value.

Return type:str
getBackend()

Returns the header BACKEND value.

Return type:str
getSN()

Returns the average pulse signal-to-noise ratio.

Return type:float
isCalibrator()

Returns if the file is a calibration observation or not, given by the OBS_MODE flag in the header.

Return type:bool
record(frame)

Internal function that runs within state-changing functions to record those state changes to a history variable that can be written out if the archive is saved.

Parameters:frame (frame) – Frame object returned by python’s inspect module.

Prints all elements in the PyPulse history list.

History Class

The History class stores the History table in the PSRFITS file. Typical users should not need to worry about using this class directly. It can be accessed in an Archive ar using ar.history (no function call).

class History(history)
Parameters:history (pyfits.hdu.table.BinTableHDU) – The binary table header data unit (HDU).
getValue(field[, num=None])

Returns a dictionary array value.

Parameters:field (str) – A column name (i.e. as provided by hdulist['HISTORY'].columns)
Example:getValue('NCHAN') will return a list of the frequency channelization history of the file.
getLatest(field)

Returns the latest key value for a given field.

Parameters:field (str) – A column name, see getValue()
printEntry(i)

Prints the i-th history entry.

Parameters:i (int) – Index of entry to print.

Polyco Class

The Polyco class stores the Polyco table in the PSRFITS file. Typical users should not need to worry about using this class directly. It can be accessed in an Archive ar using ar.polyco (no function call).

class Polyco(polyco[, MJD=None])
Parameters:MJD (float) – A default MJD to calculate the Polyco on.
getValue(field[, num=None])

Returns a dictionary array value.

Parameters:field (str) – A column name (i.e. as provided by hdulist['POLYCO'].columns)
getLatest(field)

Returns the latest key value for a given field.

Parameters:field (str) – A column name, see getValue()
calculate([MJD=None])

Calculates the phase and frequency at a given MJD.

Parameters:MJD (float) – MJD to calculate the Polyco on. If not provided, then the default MJD must be set in the constructor.
Returns:phase (float), frequency (float)
calculatePeriod([MJD=None])

Calculates the pulse period at a given MJD.

Parameters:MJD (float) – MJD to calculate the Polyco on. If not provided, then the default MJD must be set in the constructor.
Returns:period (float)
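
For readers unfamiliar with polycos, the sketch below evaluates the conventional TEMPO-style polynomial: the phase is a polynomial in the time offset from a reference epoch (in minutes), the spin frequency follows from its derivative, and the period is the reciprocal of that frequency. The argument names (tmid, rphase, f0, coeffs) and function names are hypothetical, and whether the Polyco class evaluates the table in exactly this way is an assumption.

```python
def evaluate_polyco(mjd, tmid, rphase, f0, coeffs):
    """Return (phase, frequency in Hz) from a TEMPO-style polyco.

    tmid is the reference epoch (MJD), rphase the reference phase,
    f0 the reference spin frequency (Hz), coeffs the polynomial coefficients.
    """
    dt = (mjd - tmid) * 1440.0  # offset from the reference epoch in minutes
    phase = rphase + dt * 60.0 * f0
    freq = f0
    for i, c in enumerate(coeffs):
        phase += c * dt ** i
        if i > 0:
            # Derivative of the polynomial, converted from per minute to per second.
            freq += i * c * dt ** (i - 1) / 60.0
    return phase, freq

def period_from_polyco(mjd, tmid, rphase, f0, coeffs):
    """Pulse period in seconds as the reciprocal of the polyco frequency."""
    _, freq = evaluate_polyco(mjd, tmid, rphase, f0, coeffs)
    return 1.0 / freq

# One minute after the reference epoch with a toy set of coefficients.
phase, freq = evaluate_polyco(1.0 / 1440.0, 0.0, 0.0, 100.0, [0.5, 0.6, 0.0])
period = period_from_polyco(1.0 / 1440.0, 0.0, 0.0, 100.0, [0.5, 0.6, 0.0])
```
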