Archive Class

The Archive class is the primary mechanism for opening PSRFITS files.

class Archive(filename[, prepare=True, lowmem=False, verbose=True, weight=True, center_pulse=True, baseline_removal=True, wcfreq=True, thread=False, onlyheader=False])
Parameters
  • prepare (bool) – Argument passed to load(). If True, then the file will be automatically polarization averaged with pscrunch(), dedispersed with dedisperse() (using the weighted center frequency if wcfreq is set to True), and centered with center() if center_pulse is set to True.

  • lowmem (bool) – Argument passed to load(). If True, then the PSRFITS file is opened in memmap mode and the data arrays are also replaced with memmaps.

  • verbose (bool) – Print extra information on loading and processing.

  • weight (bool) – Argument passed to load(). Use the stored data weights, which is the typical mode.

  • center_pulse (bool) – Argument passed to load(). If True, then the peak of the pulse is centered in the middle of the data arrays. This is preferred for plotting purposes but the resulting arrival-time shifts are computed and stored internally.

  • baseline_removal (bool) – Argument passed to load(). Subtracts the baseline intensity of the average profile off-pulse region from all individual data profiles using remove_baseline().

  • wcfreq (bool) – Argument passed to load(). If True, then the weighted center frequency is used in dedisperse() if prepare=True.

  • thread (bool) – Argument passed to load(). If True, then the calculation of the data array will be parallelized, which can lead to some speed-up for large data files but will take longer for small data files given the extra overhead required to start the process.

  • onlyheader (bool) – Argument passed to load(). If True, then only the primary and table headers are processed, without the data array. This is much faster if you only need access to metadata.

Usage:

ar = Archive(FILENAME) #loads archive, dedispersed and polarization averaged by default
ar.tscrunch() #averages the pulse in time
data = ar.getData() #returns the numpy data array for use by you
ar.imshow() #plots frequency vs phase for the pulses

Description of Data

From Appendix A.1 of the thesis Lam 2016:

The primary data array of profiles in a PSRFITS file is given by \(\mathcal{I}(t,\mathrm{pol},\nu,\phi)\), the pulse intensity as a function of time \(t\), polarization \(\mathrm{pol}\), frequency \(\nu\), and phase \(\phi\), where the arguments are in the order of the array dimensions. To save memory, intensity data are stored in multiple arrays. The raw data array (DATA) \(d\) is the largest in dimensionality but for folded pulse data is typically stored as an array of 16-bit integers. To retrieve the raw data value for each pulse profile, the data array is then multiplied by a scale array (DAT_SCL) \(s\) and an offset array (DAT_OFFS) \(o\) is added. An array of weights (DAT_WTS) \(w\) is also stored internally and typically modifies the raw data, e.g., when excising radio frequency interference. The three modifier arrays are of much smaller size than the data array and are typically stored in 32-bit single-precision float format. Mathematically, the resultant array of pulse intensities can be written as

\[\mathcal{I}(t,\mathrm{pol},\nu,\phi) = \left[s(t,\mathrm{pol},\nu)\times d(t,\mathrm{pol},\nu,\phi)+o(t,\mathrm{pol},\nu)\right] w(t,\nu).\]
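The equation above can be sketched with NumPy broadcasting. This is only an illustration of the arithmetic, not PyPulse's loading code; the array sizes and values are made up for the example.

```python
import numpy as np

# Hypothetical dimensions: 2 subintegrations, 1 polarization, 3 channels, 4 bins.
nsub, npol, nchan, nbin = 2, 1, 3, 4

rng = np.random.default_rng(0)
d = rng.integers(-100, 100, size=(nsub, npol, nchan, nbin)).astype(np.int16)  # DATA
s = np.full((nsub, npol, nchan), 0.5, dtype=np.float32)    # DAT_SCL
o = np.full((nsub, npol, nchan), 10.0, dtype=np.float32)   # DAT_OFFS
w = np.ones((nsub, nchan), dtype=np.float32)               # DAT_WTS
w[0, 1] = 0.0  # e.g. one channel excised in the first subintegration

# s and o gain a trailing phase axis; w gains polarization and phase axes.
intensity = (s[..., np.newaxis] * d + o[..., np.newaxis]) * w[:, np.newaxis, :, np.newaxis]
```

The scale, offset, and weight arrays stay small because they carry no phase axis; broadcasting expands them only during the multiplication.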

PSRFITS files also contain a wide range of additional information stored internally, including a history of all PSRCHIVE modifications to the file, a folding ephemeris, and a large global header of useful metadata. Besides the data array, PyPulse will unpack and store all extra information for retrieval via get() methods as desired.

Methods

load(filename[, prepare=True, center_pulse=True, baseline_removal=True, weight=True, wcfreq=True, onlyheader=False])

Load a PSRFITS file, process the metadata, and form the data arrays. This is called internally by __init__().

Parameters
  • filename (str) – Path to load file from.

  • prepare (bool) – This performs three tasks. It will polarization average the data via pscrunch(), dedisperse the data with dedisperse(), and rotate the pulse so that the peak is in the center of phase with center(). For centering, this will store the relevant time delays associated with the rotation.

  • center_pulse (bool) – The peak of the pulse is centered in the middle of the data arrays. This is preferred for plotting purposes but the resulting arrival-time shifts are computed and stored internally.

  • baseline_removal (bool) – Subtract the baseline intensity of the average profile off-pulse region from all individual data profiles.

  • weight (bool) – Use the stored data weights, which is the typical mode.

  • wcfreq (bool) – The weighted center frequency is used in dedisperse() if prepare=True.

  • onlyheader (bool) – Only the primary and table headers are processed, without the data array. This is much faster if you only need access to metadata.

Returns

None

save(filename)

Save the data to a new PSRFITS file.

Parameters

filename (str) – Path to save file to.

Warning

save() will output a PSRFITS file but the output data arrays vary slightly from the input data arrays. More

unload(filename)

Same as save(). Follows PSRCHIVE convention.

gc()

Manually clear the data cube and weights for Python garbage collection

shape([squeeze=True])

Return the shape of the data array.

Parameters

squeeze (bool) – Return the shape of the data array when dimensions of length 1 are removed.

Returns

shape, tuple of integers

reset([prepare=True])

Replace the data with the original clone, preventing full reloading. Useful for larger files but only if the lowmem flag is set to True.

Parameters

prepare (bool) – Argument passed to load().

scrunch([arg='Dp', **kwargs])

Average the data cube along different axes.

Parameters

arg (str) – Can be T for tscrunch(), p for pscrunch(), F for fscrunch(), B for bscrunch(), and D for dedisperse(), following the PSRCHIVE conventions.

Returns

self

tscrunch([nsubint=None, factor=None])

Perform a weighted average of the data cube along the time dimension.

Parameters
  • nsubint (int) – Time average to this many subintegrations

  • factor (int) – Time average by this factor

Returns

self
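As a rough illustration of what a weighted time average does, here is a minimal NumPy sketch. It is not PyPulse's implementation: the function name is hypothetical, the data are assumed to be polarization averaged already, and returning zero for channels whose weights sum to zero is an assumption made for this example.

```python
import numpy as np

def weighted_time_average(data, weights):
    """Collapse the time axis of data[t, f, phase] using weights[t, f]."""
    w = weights[:, :, np.newaxis]          # add a phase axis for broadcasting
    wsum = w.sum(axis=0)
    # Avoid dividing by zero where every subintegration is zero-weighted.
    safe = np.where(wsum > 0, wsum, 1.0)
    return (data * w).sum(axis=0) / safe
```

For example, two subintegrations with weights 1 and 3 and profiles `[1, 2]` and `[3, 4]` average to `[2.5, 3.5]`.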

pscrunch()

Perform an average of the data cube along the polarization dimension. Can handle data in Coherence (AABBCRCI) or Stokes (IQUV) format.

Returns

self

Todo

Perform a weighted average of the data cube

fscrunch([nchan=None, factor=None])

Perform a weighted average of the data cube along the frequency dimension.

Parameters
  • nchan (int) – Frequency average to this many channels

  • factor (int) – Frequency average by this factor

Returns

self

bscrunch([nbins=None, factor=None])

Perform an average of the data cube along the phase (bin) dimension.

Parameters
  • nbins (int) – Phase average to this many bins

  • factor (int) – Phase average by this factor

Returns

self

Todo

Perform a weighted average of the data cube

dedisperse([DM=None, reverse=False, wcfreq=False])

Dedisperse the pulses by introducing the appropriate time delays and rotating in phase.

Parameters
  • DM (float) – Dispersion measure used to calculate the time delays.

  • reverse (bool) – Perform dispersion of the pulse profiles.

  • wcfreq (bool) – Use the weighted center frequency.

Returns

self
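The idea of "time delays plus a rotation in phase" can be sketched as follows. This is an illustration only: the function names are hypothetical, the delays use the standard cold-plasma dispersion relation, and the rotation is rounded to whole bins with `np.roll`, which is a simplification rather than a statement of how PyPulse rotates profiles.

```python
import numpy as np

K_DM = 4.148808e3  # dispersion constant in MHz^2 pc^-1 cm^3 s

def dispersion_delays(freqs_mhz, dm, ref_freq_mhz):
    """Delay in seconds of each channel relative to the reference frequency."""
    return K_DM * dm * (freqs_mhz**-2 - ref_freq_mhz**-2)

def dedisperse_profiles(data, freqs_mhz, dm, period, ref_freq_mhz, reverse=False):
    """Rotate data[f, phase] by the whole-bin dispersion delay of each channel."""
    nbin = data.shape[-1]
    delays = dispersion_delays(freqs_mhz, dm, ref_freq_mhz)
    shifts = np.rint(delays / period * nbin).astype(int)
    sign = 1 if reverse else -1  # reverse=True re-applies the dispersion
    out = np.empty_like(data)
    for i, shift in enumerate(shifts):
        out[i] = np.roll(data[i], sign * shift)
    return out
```

Lower frequencies arrive later, so dedispersion rolls those channels back toward earlier phase; calling the function again with `reverse=True` undoes the rotation.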

dededisperse([DM=None, wcfreq=False])

Runs dedisperse() with the reverse=True flag. See that function for parameter notation.

calculateAverageProfile()

Calculate the average profile by performing an unweighted average along each dimension. Automatically calls calculateOffpulseWindow().

Todo

Perform a weighted average.

calculateOffpulseWindow()

Calculate an off-pulse window using the SinglePulse class, with the windowsize parameter equal to one-eighth the number of phase bins.
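One plausible way to pick such a window is to slide a window of that length around the profile and keep the position with the smallest summed intensity. The sketch below assumes that criterion; the text here does not say which criterion SinglePulse uses, and the function name is hypothetical.

```python
import numpy as np

def offpulse_window(profile, windowsize=None):
    """Return the bin indices of the cyclic window with the lowest summed intensity."""
    nbin = len(profile)
    if windowsize is None:
        windowsize = nbin // 8  # one-eighth of the phase bins
    # Pad cyclically so that windows may wrap around the end of the profile.
    wrapped = np.concatenate([profile, profile[:windowsize - 1]])
    sums = np.convolve(wrapped, np.ones(windowsize), mode="valid")
    start = int(np.argmin(sums))
    return (start + np.arange(windowsize)) % nbin
```

The returned indices can be used directly to index the profile, for example to estimate the off-pulse mean or noise level.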

center([phase_offset=0.5])

Center the peak of the pulse in the middle of the data arrays.

Parameters

phase_offset (float) – Determine the phase offset (in [0,1]) of the peak, i.e., impose an arbitrary rotation to where the center of the peak should fall.

Returns

self
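Centering amounts to a cyclic rotation of the phase axis. A minimal sketch, with a hypothetical function name and whole-bin rotation assumed for simplicity:

```python
import numpy as np

def center_peak(profile, phase_offset=0.5):
    """Rotate a profile so its peak lands at phase_offset; return the profile and shift."""
    nbin = len(profile)
    target = int(nbin * phase_offset)
    shift = target - int(np.argmax(profile))
    return np.roll(profile, shift), shift
```

The returned shift (in bins) is the quantity that would need to be recorded so that arrival times can later be corrected for the rotation.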

removeBaseline()

Removes the baseline of the pulses given the off-pulse window of the average pulse profile pre-calculated by calculateAverageProfile().

Returns

self
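The operation can be illustrated with a short NumPy sketch. It assumes the baseline of each profile is the mean over the off-pulse bins, which is an assumption of this example, and the function name is hypothetical.

```python
import numpy as np

def subtract_baseline(data, offpulse_bins):
    """Subtract each profile's off-pulse mean; the phase axis is the last axis."""
    baseline = data[..., offpulse_bins].mean(axis=-1, keepdims=True)
    return data - baseline
```

The same off-pulse bins, taken from the average profile, are applied to every individual profile in the array.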

remove_baseline()

See removeBaseline().

getLevels([differences=False])

Returns calibration levels if the Archive is a calibrator in the form of a square wave signal. If differences is set to True, then this function will return the frequencies, the amplitude differences in the height of the square wave as a function of polarization/frequency, and the associated errors. If False, then it will return the frequencies, the mean values of the low and high portions of the square wave and the associated errors.

Parameters

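differences (bool) – If True, return the amplitude differences in the height of the square wave rather than the mean values of its low and high portions.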

getPulsarCalibrator()

Uses getLevels() to get a Calibrator object with associated metadata

Return type

Calibrator

calibrate(psrcal[, fluxcal=None])

Polarization calibrates the data using another archive file. Flux calibration optional.

Parameters
  • psrcal (Archive) – Pulsar calibrator Archive.

  • fluxcal (Archive) – Flux calibrator Archive.

Warning

This function is under construction.

getData([squeeze=True, setnan=None, weight=True])

Return the data array.

Parameters
  • squeeze (bool) – All dimensions of length 1 are removed.

  • setnan (float) – Replace all np.nan with value.

  • weight (bool) – Return the data array with weights applied.

Returns

numpy.ndarray

setData(newdata)

Replaces the data array with new data. Must be the same shape.

Parameters

newdata (numpy.ndarray) – New data array.

getWeights([squeeze=True])

Return a copy of the weights array.

Parameters

squeeze (bool) – All dimensions of length 1 are removed.

setWeights(val[, t=None, f=None])

Set weights to a certain value. Can be used for RFI-excision routines.

Parameters
  • val (float) – Value to set the weights to.

  • t (int) – Time index

  • f (int) – Frequency index
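As an illustration of zero-weighting for RFI excision, the sketch below operates on a plain weights array of shape (time, frequency). Treating `None` as "every index along that axis" is an assumption of this example and is not stated above; the function name is hypothetical.

```python
import numpy as np

def set_weights(weights, val, t=None, f=None):
    """Return a copy of weights[t, f] with the selected entries set to val."""
    out = weights.copy()
    t_sel = slice(None) if t is None else t   # None selects all times (assumed)
    f_sel = slice(None) if f is None else f   # None selects all channels (assumed)
    out[t_sel, f_sel] = val
    return out
```

Zeroing a whole channel is then `set_weights(w, 0.0, f=2)`, and zeroing a single subintegration-channel pair is `set_weights(w, 0.0, t=1, f=0)`.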

saveData([filename=None, ext='npy', ascii=False])

Save the data array to a different format. Default is to save to a numpy binary file (.npy).

Parameters
  • filename (str) – Filename to save the data to. If none, save to the archive’s original filename after replacing the extension with ext.

  • ext (str) – Filename extension

  • ascii (bool) – Save the data to a text file. If all four dimensions have length greater than 1, the data are saved in time, polarization, frequency, and phase order, with intensity as the fifth column. Otherwise, use numpy’s savetxt() to output the array.
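The five-column layout described for the four-dimensional case can be sketched like this. It only illustrates the column ordering; the function name is hypothetical and the exact text formatting PyPulse writes is not shown here.

```python
import numpy as np

def to_five_columns(data):
    """Flatten data[t, pol, f, phase] into rows of (t, pol, f, phase, intensity)."""
    t, pol, f, phase = np.indices(data.shape)
    return np.column_stack(
        [t.ravel(), pol.ravel(), f.ravel(), phase.ravel(), data.ravel()]
    )
```

The resulting two-dimensional array could then be written with `numpy.savetxt`.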

outputPulses(filename)

Write out a standard .npy file by calling saveData().

Parameters

filename (str) – Filename to save the data to.

getAxis([flag=None, edges=False, wcfreq=False])

Get the time or frequency axes for plotting.

Parameters
  • flag (str) – “T” for the time axis, “F” for the frequency axis.

  • edges (bool) – Return the edges of each subintegration/channel rather than the centers. Better for imshow plotting because of the extent parameter.

  • wcfreq (bool) – Use the weighted center frequency.

Return type

numpy.ndarray

Todo

Let flag be both “T” and “F”.
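To show why edges suit imshow-style plotting, the sketch below converts bin centers to edges. It is a generic helper with a hypothetical name, not the getAxis() implementation, and it assumes at least two centers.

```python
import numpy as np

def centers_to_edges(centers):
    """Convert N bin centers into N+1 bin edges (requires N >= 2)."""
    centers = np.asarray(centers, dtype=float)
    half = np.diff(centers) / 2.0
    inner = centers[:-1] + half          # midpoints between neighboring centers
    first = centers[0] - half[0]         # extrapolate the outermost edges
    last = centers[-1] + half[-1]
    return np.concatenate([[first], inner, [last]])
```

The first and last edges give the full span of the axis, which is what matplotlib's `extent` argument expects.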

getFrequencies()

Convenience function for getAxis('F').

getFreqs()

See getFrequencies().

getTimes()

Convenience function for getAxis('T').

getPulse(t[, f=None])

Get the pulse shape as a function of time and potentially frequency if provided. Assumes the shape of the data is polarization averaged.

Parameters
  • t (int) – Time index

  • f (int) – Frequency index

Return type

numpy.ndarray

Todo

Do not assume polarization averaging.

getPeakFlux(t[, f=None])

Return the maximum value of the pulses, with parameters passed to getPulse()

Parameters
  • t (int) – Time index

  • f (int) – Frequency index

Return type

float

getIntegratedFlux(t[, f=None])

Return the integrated value of the pulses, with parameters passed to getPulse()

Parameters
  • t (int) – Time index

  • f (int) – Frequency index

Return type

float

getSinglePulses([func=None, windowsize=None, **kwargs])

Efficiently wrap the data array with SinglePulse.

Parameters
  • func (function) – Arbitrary function to map onto the data array.

  • windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length

  • **kwargs – Additional parameters passed to SinglePulse

Return type

numpy.ndarray of type np.object

fitPulses(template[, nums=[0,1,2,3,4,5,6], flatten=False, func=None, windowsize=None, **kwargs])

Fit all of the pulses with a given template shape.

Parameters
  • template (list/numpy.ndarray) – Template shape

  • nums (list/numpy.ndarray) – Numbers that denote which return values of fitPulse() from SinglePulse to return. Example: to return only TOA values, use nums=[1]. For TOA values and scale factors, use nums=[1,3]. Defaults to all values.

  • flatten (bool) – Flatten the data array.

  • func (function) – Arbitrary function to map onto the data array.

  • windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length

  • **kwargs – Additional parameters passed to SinglePulse

getDynamicSpectrum([window=None, template=None, mpw=None, align=None, windowsize=None, verbose=False, snr=False, maketemplate=True])

Return the dynamic spectrum.

Parameters
  • window (numpy.ndarray) – Return the dynamic spectrum using only certain phase bins.

  • template (list/numpy.ndarray) – Generate the dynamic spectrum using the scale factor from template matching. Otherwise simply sum along the phase axis.

  • mpw (list/numpy.ndarray) – Main-pulse window if calculating the dynamic spectrum using a template. Required if a template is provided.

  • align (float) – Parameter passed to SinglePulse that describes a rotation of the pulse.

  • windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length

  • verbose (bool) – Print the time index as each template is fit.

  • snr (bool) – Instead of the scale factors, return the signal-to-noise ratios.

  • maketemplate (bool) – Instead of supplying a template, make a basic smoothed one from the average pulse for matched filtering.

Warning

Return values are not well-defined. Can either return the dynamic spectrum, or a tuple of the scale factors, offsets, and errors of the template fit.

plot([ax=None, show=True])

Basic plotter of the data, if the data array can be reduced to one dimension.

Parameters
  • ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.

  • show (bool) – Generate a matplotlib plot display.

imshow([ax=None, cbar=False, mask=None, show=True, **kwargs])

Basic plotter of the data, if the data array can be reduced to two dimensions. The origin is set to the lower left.

Parameters
  • ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.

  • cbar (bool) – Include a matplotlib colorbar.

  • mask (numpy.ndarray) – Apply a mask array using the conventions of a numpy masked array (numpy.ma.core.MaskedArray)

  • show (bool) – Generate a matplotlib plot display.

  • **kwargs – Additional arguments to pass to imshow.
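For reference, the numpy masked-array convention mentioned for mask is that True marks an entry to hide. A small standalone illustration with made-up values:

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 50.0, 6.0]])
mask = data > 10.0                       # True marks values to hide
masked = np.ma.masked_array(data, mask=mask)
```

Masked entries are skipped by reductions such as `masked.max()`, and matplotlib leaves them blank when a masked array is plotted.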

pavplot([ax=None, mode='GTpd', show=True, wcfreq=True])

Produces a PSRCHIVE pav-like plot for comparison

Parameters

ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.

waterfall([offset=None, border=0, labels=True, album=False, bins=None, show=True])

Produce a waterfall plot if the data array can be reduced to two dimensions.

Parameters
  • offset (float) – Y offset of the data

  • border (float) – Fractional border around pulses.

  • labels (bool) – Plot tick labels.

  • album (bool) – Plot white on black background instead of black on white background.

  • bins (numpy.ndarray) – Selection of phase bins to plot

joyDivision([border=0.1, labels=False, album=True, **kwargs])

Calls waterfall() in the style of the Joy Division album cover. All parameters are passed to the function.

time(template, filename[, MJD=False, wcfreq=False, **kwargs])

Calculate times-of-arrival (TOAs).

Parameters
  • template (list/numpy.ndarray/Archive) – Template shape to fit to the pulses.

  • filename (str) – Path to save text to. If filename=None, print the text.

  • MJD (bool) – Calculate absolute TOAs in MJD units instead of relative TOAs in bin (time) units.

  • simple (bool) –

  • wcfreq (bool) – Use the weighted center frequency.

Warning

MJD=True is currently under testing and comparisons with PSRCHIVE.

getNsubint()

Returns the current number of subintegrations.

Return type

int

getNpol()

Returns the current number of polarization states.

Return type

int

getNchan()

Returns the current number of frequency channels.

Return type

int

getNbin()

Returns the current number of phase bins.

Return type

int

getPeriod([header=False])

Returns the period of the pulsar. By default returns the Polyco-calculated period. Otherwise, returns the period as calculated by the pulsar parameter table. If a calibrator file, returns 1 divided by the header CAL_FREQ value.

Parameters

header (bool) – Enforce a return of the pulsar parameter table value.

Return type

float

getValue(value)

Looks for a key in one of the headers and returns the value. First looks in the primary header, then the subintegration header, then the pulsar parameter table if it exists.

Parameters

value (str) – Value to look for.

Return type

str

getSubintinfo(value)

Looks for a key in the subintegration header, a subset of the functionality of getValue()

Parameters

value (str) – Value to look for.

Return type

str

getName()

Returns the header SRC_NAME value.

Return type

str

getMJD([full=False, numwrap=float])

getTbin([numwrap=float])

Returns the time per phase bin.

Parameters

numwrap (type) – Cast the return value into a type.

Return type

Value given by numwrap

getDM()

Returns the subintegration header DM value.

Return type

float

getRM()

Returns the subintegration header RM value.

Return type

float

getCoords([parse=True])

Returns the header coordinate (RA, DEC) values.

Parameters

parse (bool) – Return each value as a tuple of floats

Returns

RA, Dec, each either as a string or as a tuple.

getPulsarCoords([parse=True])

See getCoords().

getTelescopeCoords()

Returns the header ANT_X, ANT_Y, ANT_Z values.

Return type

tuple

getBandwidth([header=False])

Returns the observation bandwidth as the product of the channel bandwidth (subintegration header CHAN_BW) and the number of channels (subintegration header NCHAN) values.

Parameters

header (bool) – Returns the header OBSBW value

Return type

float

getDuration()

Returns the sum of the subintegration header TSUBINT values.

Return type

float

getDurations()

Return the subintegration durations array.

Return type

numpy.ndarray

Todo

Check for completeness of inputs into the durations array

getCenterFrequency([weighted=False])

Returns the center frequency. If a HISTORY table is provided in the PSRFITS file, return the latest CTR_FREQ value. Otherwise, return the header OBSFREQ value.

Parameters

weighted (bool) – Return the center frequency weighted by the weights array, \(\sum_i w_i \nu_i / \sum_i w_i\) for frequency channel \(i\).

Return type

float
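The weighted center frequency formula can be written out directly. This sketch assumes one weight per frequency channel; how the (time, frequency) weights array is collapsed over time is not specified above. The function name is hypothetical.

```python
import numpy as np

def weighted_center_frequency(freqs, weights):
    """Compute sum_i(w_i * nu_i) / sum_i(w_i) over frequency channels."""
    freqs = np.asarray(freqs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * freqs) / np.sum(weights))
```

With channels at 1000, 1100, and 1200 MHz and the top channel zero-weighted, the weighted center moves from 1100 MHz down to 1050 MHz.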

getFrequencyUnit()

Returns the unit associated with the frequency axis.

Return type

str

getFreqUnit()

See getFrequencyUnit()

getTimeUnit()

Returns the unit associated with the time axis.

Return type

str

getDataUnit()

Returns the unit associated with the data.

Return type

str

getScaleUnit()

See getDataUnit()

getIntensityUnit()

See getDataUnit()

getFluxDensityUnit()

See getDataUnit()

getFluxUnit()

See getDataUnit()

getTelescope()

Returns the header TELESCOP value.

Return type

str

getFrontend()

Returns the header FRONTEND value.

Return type

str

getBackend()

Returns the header BACKEND value.

Return type

str

getSN()

Returns the average pulse signal-to-noise ratio.

Return type

float

isCalibrator()

Returns if the file is a calibration observation or not, given by the OBS_MODE flag in the header.

Return type

bool

record(frame)

Internal function that runs within state-changing functions to record those state changes to a history variable that can be written out if the archive is saved.

Parameters

frame (frame) – Frame object returned by python’s inspect module.

Prints all elements in the PyPulse history list.

History Class

The History class stores the History table in the PSRFITS file. Typical users should not need to worry about using this class directly. It can be accessed in an Archive ar using ar.history (no function call).

class History(history)
Parameters

history (pyfits.hdu.table.BinTableHDU) – The binary table header data unit (HDU).

getValue(field[, num=None])

Returns a dictionary array value.

Parameters

field (str) – A column name (i.e., as provided by hdulist[‘HISTORY’].columns)

Example

getValue('NCHAN') will return a list of the frequency channelization history of the file.

getLatest(field)

Returns the latest key value for a given field.

Parameters

field (str) – A column name (i.e., as provided by hdulist[‘HISTORY’].columns)

printEntry(i)

Prints the i-th history entry.

Parameters

i (int) – Index of entry to print.

Polyco Class

The Polyco class stores the Polyco table in the PSRFITS file. Typical users should not need to worry about using this class directly. It can be accessed in an Archive ar using ar.polyco (no function call).

class Polyco(polyco[, MJD=None])
Parameters

MJD (float) – A default MJD to calculate the Polyco on.

getValue(field[, num=None])

Returns a dictionary array value.

Parameters

field (str) – A column name (i.e., as provided by hdulist[‘POLYCO’].columns)

getLatest(field)

Returns the latest key value for a given field.

Parameters

field (str) – A column name (i.e., as provided by hdulist[‘POLYCO’].columns)

calculate([MJD=None])

Calculates the phase and frequency at a given MJD.

Parameters

MJD (float) – MJD to calculate the Polyco on. If not provided, then the default MJD must be set in the constructor.

Returns

phase (float), frequency (float)
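For orientation, a polyco evaluation under the standard TEMPO convention looks like the sketch below. The parameter names are hypothetical stand-ins for the reference MJD, reference phase, reference spin frequency, and coefficient list; whether calculate() follows this convention term for term is an assumption, not something stated here.

```python
def evaluate_polyco(mjd, ref_mjd, ref_phase, ref_f0, coeffs):
    """Evaluate pulse phase (turns) and spin frequency (Hz) at a given MJD."""
    dt = (mjd - ref_mjd) * 1440.0          # minutes from the reference epoch
    phase = ref_phase + dt * 60.0 * ref_f0
    freq = ref_f0
    for i, c in enumerate(coeffs):
        phase += c * dt**i                 # polynomial correction to the phase
        if i > 0:
            # Derivative of the polynomial, converted from per minute to Hz.
            freq += i * c * dt ** (i - 1) / 60.0
    return phase, freq
```

The pulse period at that epoch is then the reciprocal of the returned frequency.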

calculatePeriod([MJD=None])

Calculates the pulse period at a given MJD.

Parameters

MJD (float) – MJD to calculate the Polyco on. If not provided, then the default MJD must be set in the constructor.

Returns

period (float)