Including New Parameters:
If you would like to include additional parameters into the sampling that is relatively straightforward. If the parameter was not
present before in the semi-analytic model you will have to include it in the input_***.par file and read it in in read_parameters.c.
In addition, you will have to pass the parameter value from the MCMC to the semi-analytic model in function void read_mcmc_par
(int snapnum). Finally, you will have to include the new parameter in the input file ./input/MCMC_inputs/MCMCParameterPriorsAndSwitches.txt
and increase the value for the number of parameters at the top.
Including New Observational Constraints:
Including a new observational constraint is slightly more complex and basically involves changes in the input files,
mcmc_likelihood() and mcmc_save.c(). The first thing to do is to increase the number of observational constraints defined in
mcmc_vars.h: "#define MCMCNConstraints" and at the top of ./input/MCMC_inputs/MCMCObsConstraints.txt and
./input/MCMC_inputs/MCMCWeightsObsConstraints.txt. Then, the new observational constraint(s) must be included in these two input files.
A type of statistical test needs to be specified together with switches and weights for all output redshifts. Then a file with the same
name specified in these files must be added to the directory ./ObsConstraints/ for all the redshifts to be used.
After the inputs, mcmc_save.c() and mcmc_likelihood() must be changed. If a property not used before by the mcmc is needed, this must
be saved in mcmc_save.c (section mcmc_save below). Then, a method of binning the theoretical predictions and a statistical test must be
included in mcmc_likelihood(). Section mcmc_likelihood below explains how to use the functions already present in the code and how to include new ones.
MCMC files in depth:
The code related to the MCMC sampling is divided into 4 main files:
- mcmc.c: file containing most of the functions that control the mcmc. These include a main function called senna() that
controls the sampling and calls the galaxy formation model, sam(), and analyses the returned value of likelihood. It also contains
functions to read information about the observational constrains and the dark matter sample of trees and to propose new values
for the parameters at each step.
- mcmc_likelihood.c: file containing the functions binning model galaxies and performing tests to compare the model with the
different observational constraints.
- mcmc_save.c: file containing the function that passes the properties needed to compare the model with observations from the
structure inside the galaxy formation model to the structure used by the MCMC: MCMC_GAL (doing the necessary conversion of units).
At the moment the MCMC works with units that include h factors.
- mcmc_vars.h: all variables related to the mcmc are defined here, including the galaxy structure used by the mcmc: MCMC_GAL.
In addition, mcmc_proto.h contains the headers for all the functions used in the MCMC.
mcmc.c:
mcmc.c contains most functions used by the MCMC sampling including the 'main' function for the method: Senna(). In MCMC mode, Senna() is
the function controlling the entire code (instead of main.c used in normal mode).
The first function to be called inside Senna() is initialize_mcmc_par_and_lhood(), which reads the initial values for all the parameters and
a likelihood either from a previous output (in ./output/senna_g*_**.txt) or from MCMCStartingParFile. Next, read_observations() reads all the
observational constraints into memory. initialize_mcmc_par_and_lhood() and read_observations() are read only once at the beginning.
read_sample_info() reads information about the IDs and weights of the pre selected sample of FOF groups. In principle this could be read only
once and that is in fact the case if only one simulation is used. If more than one simulation is used, currently if OPT += -DMR_PLUS_MRII,
the function must be called twice at each MCMC step to switch between simulations.
After the initialization has been done, the MCMC sampling starts. In a for loop that goes from 1 to ChainLength the semi-analytic model is run at
each step, with galaxy properties being computed for a given set of parameters (passed from the mcmc to the semi-analytic model at read_mcmc_par()),
and mcmc_likelihood() being called in the end to return a joint likelihood for the model at the current step.
If the likelihood for the model at the current step is higher than at the previous step, the current set of parameters is accepted. If the
likelihood is lower than in the previous step, the new set of parameter might still be accepted, with a probability given by the ratio of
the two likelihoods. In any new case, a new set of parameters is proposed in propose_new_parameters() and passed into the semi-analytic
model to re-start the process.
mcmc_likelihood.c:
mcmc_likelihood.c is the place where theoretical predictions are binned like observational functions and then compared with them by means of a
statistical test that returns a likelihood. There are switches for the different observational constraints and the different redshifts that
define if a given test is performed or not.
There are currently three tests available: χ2, maximum likelihood and binomial (although the binomial likelihood is not correctly calculated yet).
You can easily include more tests in this function by looking at how the current ones are implemented.
If you want to include additional constraints using the tests already provided you just need to bin the property you want to compare with
observations in the correct way. If you want to compare a luminosity or mass function of a 'simple property', then you just need to make
sure that property is saved in mcmc_save.c and use bin_function and the chi_square test. However in many cases you will need to include
your own bin function. A detailed description of the tests can be found in Henriques et al., 2013, MNRAS, 431, 3373).
For example, in the example already included for the stellar mass function of red and blue galaxies, it is necessary to divide galaxies into a red and blue
population according to some criteria before binning.
mcmc_save.c:
This file is where the code checks if the current galaxy is in an FOF group that was pre-selected (the tree file contains all
the FOFs in a tree, a bush). If that is the case galaxy properties are passed from HaloGal to MCMC_GAL. If you need additional
properties, that are computed by the galaxy formation model but not listed here, to constrain the model with additional observations
they must be passed between structures here. Additional h factors are included here. MCMC_GAL has galaxy properties for all
the galaxies at all the output redshifts. The maximum number of galaxies much be pre-defined (MCMCAllocFactor) as it varies
from redshift to redshift.
mcmc_vars.h:
This file contains the structure MCMC_GALAXY with all the galaxy properties available to the mcmc_likelihood() to compare with observational
constraints. If you need additional properties, that are computed by the galaxy formation model but not listed here, to constrain the model
with additional observations they must be included in this structure.