RANSA User Guide

Overview

This document provides directions for using the C++ code that goes along with the following two papers:

Tsitron J, Ault AD, Broach JR, Morozov AV (2011) Decoding Complex Chemical Mixtures with a Physical Model of a Sensor Array. PLoS Comput Biol 7(10): e1002224. doi:10.1371/journal.pcbi.1002224 pdf

and

Tsitron J, Kreller CR, Sekhar PK, Mukundan R, Garzon FH, Brosha EL, Morozov AV (2013) Bayesian decoding of the ammonia response of a zirconia-based mixed-potential sensor in the presence of hydrocarbon interference. (submitted)

Compilation

Requirements:

To compile the executable `sample', run:

make

or

make sample

Options

-f filename

Load Data file

-model model_type

Choose class of models

DEFAULT: recX_ligY

All Possible modes:

  • Calibration and Inference (GPCR paper):

    recX_ligY

    MULT_SIG_recX_ligY

    MuTOT_recX_ligY

    (X = number of receptors, Y = number of ligands. Both are single digits)

  • Optimization of array design:

    FULL_HESS_recXX_ligYY

    (XX = number of receptors, YY = number of ligands. Both are double digits)

  • Calibration and Inference (gas sensors paper):

    LINEAR_Xrec_Ylig

    NONLIN_CALIB_Xrec

    NONLIN_PRED_Xrec

    (X = number of receptors, Y = number of ligands. Both are single digits)

-gapfile gapfilename

Load concentration `gap' file with dilution steps.

(Not needed in recX_ligY and FULL_HESS_recXX_ligYY modes)

-mfile modelfilename

Load file with fixed model parameters

-pfile priorfilename

Load file with priors for each parameter

DEFAULT:

  • ΔG: Uniform [-20 +5] (All modes in GPCR paper)

  • A, b: Uniform [0 1] (All modes in GPCR paper)

  • V0: Uniform [-1 1] (All modes in gas sensors paper)

  • A (slope): Uniform [0 1] (All modes in gas sensors paper)

  • σ: Jeffreys [0.0001 100.0] (ALL modes)

  • α: Jeffreys [0.0001 100.0] (ALL modes)

  • μtot: Uniform[-10 -2] (All modes in GPCR paper)

  • [TOTAL]: Uniform[0.0001 1000.0] (All modes in gas sensors paper)

-n number of nested objects

DEFAULT: 100

-max number of nesting iterations

DEFAULT: 1000

-mc number of MCMC trials for finding a new object

DEFAULT: 20

-D size of unweighted posterior sample

DEFAULT: 1000

-out posterior_filename

File with unweighted posterior samples

(nested sampling results in a posterior distribution where each sample is weighted by its log likelihood. Here, you can output an unweighted sample generated by MCMC)

-rs random seed

change random seed from default value

To run code for Calibration or Inference

For analysis of data featured in GPCR-based sensor array paper,

Use one of the following Options:

-model recX_ligY

-model MULT_SIG_recX_ligY

-model MuTOT_recX_ligY

You will need a Data file consisting of three columns:

   mu        I      recN
   -3       .99       1
 -3.5       .92       1
   -4       .84       1
   -3       .98       2 
 -3.5       .81       2
   -4       .80       2
  ...       ...      ...    
  ...       ...      ...
  ...       ...      ...    

(where mu = log₁₀[total conc], I = Intensity, and recN = receptor_number)

recX_ligY mode:

MULT_SIG_recX_ligY mode:

MuTOT_recX_ligY mode:

    deltaValue
    0.0
    0.5
    1.0
    1.5
    2.0
    2.5
    3.0
    3.5
    6.0

in this example, the gapfile could correspond to the first column in the following Data file:

  mu           I        recN
  -3           1          1
-3.5        0.98          1
  -4        0.99          1
-4.5        0.77          1
  -5        0.44          1
-5.5        0.14          1
  -6        0.04          1
-6.5        0.02          1
  -9        0.02          1

here, log₁₀[total conc] at the reference point is -3.

For analysis of data featured in gas-sensors paper,

Use one of the following Options:

-model LINEAR_Xrec_Ylig

-model NONLIN_CALIB_Xrec

-model NONLIN_PRED_Xrec

You will need a Data file consisting of three columns:

   TOT       V      recN
    0       .0006         1
    0       .0006         1
    0       .0005         1
   20       .0009         1 
   20       .0009         1
   20       .0009         1
  ...       ...           1
  ...       ...           1 
    0       .0006         2
    0       .0006         2
    0       .0006         2
   20       .0010         2 
   20       .0011         2
   20       .0010         2
  ...       ...           2
  ...       ...           2
  ...       ...          ...    
  ...       ...          ...
  ...       ...          ...    

as well as a corresponding concentration gap file (same as described above). Here, TOT = [TOTAL], V = Voltage Response, and recN = receptor_number). In this example, there are 3 measurements at each concentration.

LINEAR_Xrec_Ylig mode:

NONLIN_CALIB_Xrec mode:

NONLIN_PRED_Xrec mode:

To run code for Optimization of parameters in sensor array design

Use the Option

-model FULL_HESS_recXX_ligYY

Model Files

Any or all of the parameters can be fixed in any mode by including a model file.

For all modes in GPCR paper, use the following format:

  Prm         Val       OnOff       Ind
  ddG          -5         0          1
    A           1         0          1
    B           0         1          1

  ddG          -5         0          2
    A           1         0          2
    B           0         1          2

alpha           1         1          1
sigma       .0001         1          1
mutot          -3         1

You can fix any of the following Prms:

ddG
A
B
alpha
sigma
mutot

and set them to any Val (keeping in mind the priors you are using). OnOff can be either 0 or 1. 1 will fix the given Prm at the indicated Val whereas 0 will allow that parameter to be fit freely. Setting the OnOff switch to 0 is equivalent to excluding that parameter from the model file.

Ind is the index label of a given parameter. Since there are (X × Y = N) ddG's, A's, and B's the Ind of these parameters will run from 1 to N. There are (Y-1) alpha's, X sigma's and only one mutot - so no Ind label is necessary for mutot.

The order of the index labels for ddG, A, and B follows this pattern:

rec# lig#Ind
rec1 lig11
rec1 lig22
rec1 lig33
rec2 lig14
rec2 lig25
rec2 lig36
rec3 lig17
......

The order of the index labels for alpha follows this pattern:

alphaInd
alpha1 = lig2/lig11
alpha2 = lig3/lig12
alpha3 = lig4/lig13
......

The index label for sigma is just the receptor label.

An example of a Model File for LINEAR_Xrec_Ylig mode from gas sensors paper:

  Prm         Val       OnOff       Ind
  V_0        -.0032       1          1
    A        .00089       1          1

  V_0       .00006        1          2
    A       .00058        1          2

alpha           1         0          1
sigma       .0001         0          1
TOTAL         200         0

Where you can fix any of the Prms:

V_0
A
alpha
sigma
TOTAL

Prior Files

Any or all of the priors can be changed by including a prior file in the following format:

  Prm    Val1    Val2   PriorType   Ind 
  ddG   -20.0     0.0       u        1 
    A     0.0     2.0       u        1 
    B     0.0     1.0       u        1 

  ddG   -20.0     0.0       u        2 
    A     0.0     2.0       u        2 
    B     0.0     1.0       u        2 

alpha    0.01   100.0       j        1 
sigma   0.001   100.0       j        1

You can modify the PriorType and/or the lower and upper bounds (Val1 and Val2, respectively) of the prior for any Prm. For the Gaussian prior, Val1 and Val2 are the mean and standard deviation, respectively.

The possible PriorTypes are:

u (Uniform)

j (Jeffreys)

g (Gaussian)

The Ind labels follow the same pattern as described above (see the section on Model Files)

Examples

1. Calibration

To see an example of a Calibration run, you will need to load a Data file (or use the one provided in the ~/RANSA/examples/ directory called UDPglc_H-20.dat):

./sample -f UDPglc_H-20.dat

The end of the output should look like:

dG1 = -6.2547 +- 0.215176
A1 = 0.750912 +- 0.0555299
B1 = 0.0299895 +- 0.0262493
sigma1 = 0.0517271 +- 0.0235276
log(sigma1) = -3.08013 +- 0.504612

To see how parameter estimation improves with an increased number of nested iterations, try:

./sample -f UDPglc_H-20.dat -max 15000

Now, the output should look like:

dG1 = -6.29776 +- 0.0580941
A1 = 0.746728 +- 0.0177048
B1 = 0.0128173 +- 0.00841609
sigma1 = 0.0364009 +- 0.00488333
log(sigma1) = -3.32185 +- 0.130941

If you want to plot a histogram of your results, you will need an unweighted posterior distribution for each parameter. You can output these results with, say, 10,000 samples into a file called Posterior_sample.out. Also, try running with a different random seed to see if your previous results are stable:

./sample -f UDPglc_H-20.dat -max 15000 -out Posterior_sample.out -D 10000 -rs 54345

2. Inference

To see an example of an Inference run, load a Data file and a gapfile (you can use the ones provided in the ~/RANSA/examples/ directory called UDPgal-UDPglc_4rec.dat and deltaFile_4rec.dat) as well as a Model file with all of the ΔG's, A's, b's, and σ's fixed (you can use the one provided called model_4rec_4lig.dat). The objective here is to fit for α's and μtot.

Since we are inputting parameters for four receptors and four ligands in model_4rec_4lig.dat, we need to use the MuTOT_rec4_lig4 mode. UDPgal-UDPglc_4rec.dat is the data collected from four experiments, each using a mixture of lig1 (UDPGal) and lig2 (UDPglc) so we expect that alpha1 = lig2/lig1 should be ≅ 1, and alpha2 = lig3/lig1 and alpha3 = lig4/lig1 should both be ≅ 0. mutot should be ≅ -3. So, if you run:

./sample -f UDPgal-UDPglc_4rec.dat -model MuTOT_rec4_lig4 -gapfile deltaFile_4rec.dat -mfile model_4rec_4lig.dat -max 15000

Your output should include the following lines:

alpha1 = 1.00997 +- 0.0867217

alpha2 = 0.00277454 +- 0.00643838

alpha3 = 0.0114507 +- 0.00107997

mutot = -2.9861 +- 0.0159282

3. Optimization of parameters in sensor array design

For a simple example of an Optimization run, try optimizing array parameters for a one-receptor, two-ligand system. You can use the Model file provided in the ~/RANSA/examples/ directory called model_FULL_HESS_example.dat which fixes the B's to 0, mutot to -3, and alpha to 0.25. The ΔG's and A's are free to be optimized. Run:

./sample -model FULL_HESS_rec01_lig02 -mfile ~/RANSA/examples/model_FULL_HESS_example.dat -max 20000

Your output should include the following lines:

dG1 = -11.4413 +- 0
A1 = 0.999991 +- 0

dG2 = -11.8439 +- 0
A2 = 0.000124153 +- 0

which indicates that this system is optimized when one ligand is a strong agonist (A1 ≅ 1) and the other is a strong antagonist (A2 ≅ 0).