Python

Basics

how to get information about a function?
Use the help function, e.g.,
import nump as np
help(np.random.chisquare)
which version of matplotlib am i using?
import matplotlib
matplotlib.__version__
comment out code:
# this is commented out
'''
this whole section of code is commented out
'''
How I do I debug? Is there a command similar to the keyboard command in matlab? The analogous command is pdb.set_trace().
import pdb
# ... various code here ...
# python stops when it gets to the next line
pdb.set_trace()

Useful tricks, gotchas, beginner's traps

Set default values for function parameters.
In this example, the function window will use default values of alpha=0.10 and type='tukey' unless otherwise specified.
def window(ht, alpha=0.10, type='tukey'):
When a function has defaults, you can call this function supplying only the variables you want, e.g.,
window(ht, type='hann')
Sometimes, it is useful to specify all of the parameter names to remember what goes where. For example,
window(ht, alpha=0.25, type='tukey')
might be useful if
window(ht, 0.25, type='tukey')
will leave a user scratching their head asking: "What is 0.25?"

ignore some outputs returned by a function.
Use an underscore in place of output that you do not want:
f22w_low, f22w_high , _, _, _ = harvest_nest('chains_GR+/h22_Lev6_CoM', N)
Copying versus reference.
Often, you want to create a copy of some array. Naively, you might think that you can do so like this
new_varb = old_varb
However, this does not create a copy of old_varb. It creates a second reference to the same object! There are various ways to avoid this, for example,
new_varb = 1.0*old_varb
The difference between 1 and 1.0.
Python treats 1 and 1.0 differently. The former is an integer while the latter is a float. Thus,
1/8 = 0
1./8 = 0.125
How do I find the path to python source code?
import inspect
import ezgw.signals as signals
inspect.getfile(signals)
equivalent of matlab's tic/toc function:
from time import time as Time
tstart = Time()
suppress warnings from importing a module:
This is useful if you get a million error messages when importing pandas.
import warnings
with warnings.catch_warnings():
    warnings.filterwarnings("ignore")
    from pandas import read_csv
Suppress all warnings: python -W ignore test_window_v6.py

Vectorized Look-Up Table
In this example, we have a 4D look-up table called table. We want to pass it values for (gamma, thetaj, k, iota) and get the look-up value with a vectorized calculation.
ang = []
ind_gamma = np.round((gamma-gamma_min)/gamma_delta).astype(int)
ind_thetaj = np.round((thetaj-thetaj_min)/thetaj_delta).astype(int)
ind_k = np.round((k-k_min)/k_delta).astype(int)
ind_iota = np.round((iota-iota_min)/iota_delta).astype(int)
ang.append(table[[ind_gamma],[ind_thetaj],[ind_k],[ind_iota]])
E_iso = Etot*np.array(ang)*0.5
What is the python equivalent of matlab's try-catch syntax?
# calculate E_iso                                                       
try:
    E_iso = cal_E_iso(logE0, Gamma, thetaj, k, iota)
except:
    pdb.set_trace()

Logic

if, else, and, or:
if type=='hann':
    wt = scisig.hann(L)
elif type=='tukey':
    wt = scisig.tukey(L, alpha, sym=True)
elif type=='rect' or type=='rect_test'
    wt = np.ones(ht.shape)
else:
    raise Exception('undefined window type')
for loops:
for x in range(0, nsamples):
  print "%1.1f" % (m1[x])
You can also make a for loop with enumerate.
tt0 = np.arange(0.001, 0.035, 0.0001)
for ii,t0 in enumerate(tt0):
  rho_res[ii] = tools.snr_exp(resf, fnew, PSD)
Define a cut for elements in an array:
# notice that the parantheses are needed here to separate the inequalities
cut = (epsilon>Emin) & (E0>Emin)


Math

complex numbers.
In numpy, complex numbers are indicated by adding a j, which represents the sqrt(-1). For example, here is the square of a complex number:
(1 + 3j)**2
Be careful because, some operations return nan if you do not include a j to tell numpy you want a complex answer:
np.sqrt(-1)=nan
np.sqrt(-1+0j)=1j
generate a random number drawn from a normal distribution: np.random.randn()

set the random seed np.random.seed(150914)


Input/Output

Printing results to the screen: print "%i samples with %i parameters" % (nsamples, nparams)

Printing number with significant digits:
print "%.2f" % x         #print two significant digits
print "%.3e" % x         #print three significant digits in sci notation
How do I generate a custom error message?
if len(hf1_m1) != len(f):
    raise Exception('undefined window type')
print without a return carriage: print "stuff",

print elements in an array separated by a space:
print " ".join([str(x) for x in my_array[1:5]] )
concatenation two strings
out_dir = './surrogate_data/'
file_name = out_dir + out_file_name
save an array to an asci file
np.savetxt(file_name, np.c_[times, hp_zero, hc_zero], fmt='%.6e')
load an array from an asci file
from astropy.io import ascii
data_table = ascii.read("sim.out")
data = data_table['col1']
load an array of complex-valued data from an asci file
data_table3=np.loadtxt("tidal_40Mpc_Lambda500.dat", dtype=complex)
if a directory path does not exist, create it:
if not os.path.exists(basedir):
   os.makedirs(basedir)
Reading csv files:
from pandas import read_csv
...
table = read_csv('./param_table2_gamma.txt', header=None, delim_whitespace=True)
Reading json files: JSON (JavaScript Object Notation) is a data format. It is used by PyMultiNest.
with open(stats_file) as f:
    stats=json.load(f)
    Z = stats["global evidence"]
    f_best,tau_best = stats['modes'][0]['maximum']
Add an argument parser to your python script.
# usage: python call_tupak.py 1 -i ../5Gpc_injections.hdf5 -p ../prior_files/5Gpc.prior -d True -t True -f True
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("idx", type=int,
                    help="index (aka job number); doubles as injection row")
parser.add_argument("-i", "--injection_file",
                    default='../5Gpc_injections.hdf5',
                    help="hdf5 injection file; see generate_injections.py")
parser.add_argument("-p", "--prior_file",
                    default='../prior_files/5Gpc.prior', help="prior file")
parser.add_argument("-d", "--dist_marg", default=True,
                    help="distance marginalization flag")
parser.add_argument("-t", "--time_marg", default=True,
                    help="time marginalization flag")
parser.add_argument("-f", "--phase_marg", default=True,
                    help="phase marginalization flag")
args = parser.parse_args()
idx=args.idx
injection_file=args.injection_file
prior_file=args.prior_file
dist=args.dist_marg
time=args.time_marg
phase=args.phase_marg

Arrays, lists, dictionaries

Define an array: z = [1, 0, 3]

selecting a row from an array: eta = dat[:,28]
Remember that python counts from zero.

size of an array: nsamples = dat.shape[0]+1

length of array: len(f)

last element in an array: last_val = hf1_m1[-1]

exclude all but last element in an array: tmp = hf1_m1[:-1]

an array of zeros: np.zeros(padlen)

concatenate arrays: hmem = np.concatenate((hmem, np.zeros(padlen)))

concatenate arrays (method two): Sh_tmp = np.transpose(np.vstack((f[band], P[:,z])))

selecting some subset of an array using a matlab-like cut: hmemf = hmemf[ff>fmin]

selecting the first N elements in an array: hmemf[1:5]

How do I store a collection of disparate information in a single python object? Is there something like matlab structs?
The python object that handles this is called a dictionary.
params = {}
params['m1']=29
params['m2']=30
Find the index of the maximum values along an axis.
idx = np.argmax(x)
create an array of evenly-spaced elements (up to but not including fmax): f = np.arange(fmin, fmax, deltaF)
create an array of evenly-spaced elements (up to and including fmax):
f = deltaF*np.linspace(1, numFreqs, numFreqs)
create an array of N evenly-spaced elements (up to and including fmax):
N = np.round(dur*fs)
t = (1./fs) * np.linspace(0, N, N)
Repmat.
Use repmat to repeat copies of an array along a new dimension.
import numpy.matlib
Sh0_array = np.transpose(np.matlib.repmat(Sh0[:,1], nmarge, 1))
flatten
If you get a numpy array that has dimensions of, say, [1, 1, 91], you may want to flatten it to [,91], for example, if you want to plot it.
E_iso = np.ndarray.flatten(np.isnan(cal_E_iso(1, 300, 2, 0, iota)))

Modules

Overview
A lot of useful python functions are shared using packages called modules. For example, all but the most trivial math operations are carried out using the python module, numpy. If you type
sqrt(3)
into python, you will get an error. The necessary syntax is
import numpy as np
np.sqrt(3)
Here are the modules I use most:
  • numpy: math
  • scipy: interpolation, signal processing, other things
  • pdb: debugging
  • matplotlib: plotting, some signal processing such as psd
    In order to install these modules, you can use pip, e.g.,
    pip install --user scipy
    
    The --user flag is necessary if you do not have admin privelages. In addition to these standard modules, I use
  • ezgw: easy tools for gravitational-wave data analysis
  • tupak: parameter estimation for the masses
  • pymultinest: nested sampling
  • h5py: hdf5
  • gwsurrogate

    Importing part of a module. It is often useful to import just some part of a module or to abbreviate a module name. Here is an example for how to do this with common conventions:
    import numpy as np
    import matplotlib.pyplot as plt
    import pdb
    import sys
    import ezgw.signals as signals
    
    How do I import my own modules?
    Create a directory to put your source code in, e.g., your_module. Next,
    cd your_module
    touch __init__.py
    
    Don't ask questions; just do it. (OK, if you reall want to know, google says, "The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string , from unintentionally hiding valid modules that occur later (deeper) on the module search path.") Next, modify your $PYTHONPATH variable (in your bashrc) to point to your new module:
    export PYTHONPATH=PATH_TO_YOUR_MODULE/your_module/:$PYTHONPATH
    
    Now you can import the module like so
    import your_module
    
    Import code from some .py file without creating a directory with a __init__.py file:
    The following command will run everything inside cal_E_iso.py. It will run the import commands at the beginning of this script. It will define variables and functions, which can be called below.
    from cal_E_iso import *
    
    What if I get an error when I run some commonly used module command? For example,
    from scipy.interpolate import interp1d
    ImportError: libptf77blas.so
    
    Often, these errors can be fixed by updating your code:
    pip install scipy --user --upgrade
    

    Plotting

    Make a plot with a legend and readable (large font) labels:
    plt.close('all')
    fig, ax = plt.subplots()
    ax.semilogy(iota, E_iso_A, 'b', label='fiducial')
    plt.axvline(x=thetaj_A, Color='b', LineStyle='--')
    ax.semilogy(iota, E_iso_B, 'r', label=r'$\Gamma=10$')
    ax.semilogy(iota, E_iso_C, 'g', label=r'$\theta_j=10^\circ$')
    plt.axvline(x=thetaj_C, Color='g', LineStyle='--')
    ax.semilogy(iota, E_iso_D, 'k', label=r'$k=2$')
    plt.xlabel(r'$\iota (deg)$', fontsize=18)
    plt.ylabel(r'$E_{iso}$', fontsize=18)
    
    # adjust the font size of the tick mark labels
    plt.xticks(fontsize=myfontsize)
    plt.yticks(fontsize=myfontsize)
    
    ax.legend(fontsize=18)
    plt.savefig('img/plot_iota.png')
    
    adding a vertical line: plt.axvline(x=15)

    Histogram with legend.
    plt.close('all')
    fig, ax = plt.subplots()
    bins = np.linspace(3600, 4000, 20)
    # The variable alpha controls the transparency of each histogram
    ax.hist(-np.array(lzn), bins, alpha=0.5, label='exact')
    ax.hist(-np.array(lzn0), bins, alpha=0.5, label='estimate')
    plt.xlabel('-log(ZN)')
    legend = ax.legend(loc='upper right', shadow=True)
    plt.savefig('img/histo.png')
    
    Scatter plot saved to file.
    fig = plt.figure()
    plt.plot(f22, f33, "bo")
    plt.plot(f22_j0, f33_j0, "rx")
    plt.xlabel(r'$f_{22}$ (Hz)', fontsize=18)
    plt.ylabel(r'$f_{33}$ (Hz)', fontsize=18);
    plt.savefig('f22_f33.png');  plt.savefig('f22_f33.pdf')
    
    Using latex in labels. The following commands will do it...
    plt.rc('text', usetex=True)
    plt.rc('font', family='serif')
    
    ...but an easier solution is to modify your matplotlibrc file to include the following commands:
    backend: Agg
    text.usetex : true
    font.family : Times New Roman
    
    To find your matplotlibrc file, try the following command from the unix command line:
    /home/ethrane/.local/lib/python2.7/site-packages/matplotlib/mpl-data/matplotlibrc
    
    The settings in this file control matplotlib. If you do not have one, you can create one here: ~/.matplotlib/matplotlibrc

    How do I get rid of this matplotlib error?
    ImportError: No module named Tkinter
    
    Make sure that the backend variable is correctly set in your matplotlibrc file
    #backend      : TkAgg
    backend      : Agg
    
    Make sure that you edit the correct matplotlibrc as you may have more than one copy. You can find the one that your python is pointing to by running this command from the unix command line:
    /home/ethrane/.local/lib/python2.7/site-packages/matplotlib/mpl-data/matplotlibrc
    
    Plot with two different horizontal axes.
    plt.close('all')
    fig = plt.figure(figsize=(6,4))
    ax1 = fig.add_subplot(111)
    ax2 = ax1.twiny()
    # N events x-axis
    ax1.plot(np.sqrt(N), 1000*tcut, 'b')
    # normalised inverse distance
    ax2.plot(N, 1000*tcut, 'b')
    ax2.cla()
    # tick function
    def tick_function(X):
    V = N
    return ["%1i" % z for z in V]
    new_tick_locations = np.sqrt(N)
    ax2.set_xlim(ax1.get_xlim())
    ax2.set_xticks(new_tick_locations)
    ax2.set_xticklabels(tick_function(new_tick_locations))
    # clean up
    ax1.set_xlabel(r"410 Mpc / $d$")
        ax2.set_xlabel(r"$N$")
    ax1.set_ylabel(r"t^0_{33}$ (ms)")
    ax1.set_xlim(1, max(np.sqrt(N)))
    plt.savefig('img/tcut.png')
    plt.savefig('pdf/tcut.pdf')
    

    Example Code

    Gravitational-wave spectrum: compare the noise amplitude spectral density of gravitational-wave detectors to the characteristic strain of a binary neutron star signal. This example illustrates: Code


    Expert

    multi-threading: mpirun -n 32 python your_code.py

    what's the difference between .py files and .pyc files? "Python automatically compiles your script to compiled code, so called byte code, before running it. When a module is imported for the first time, or when the source is more recent than the current compiled file, a .pyc file containing the compiled code will usually be created in the same directory as the .py file."


    hdf5, h5

    For information about loading, parsing, and browsing hdf5 files with python, please see the hdf5 page.


    Back to Resources