
TARDIS SN GSoC Proposal

Parameter Fitting Search Algorithm
Last updated: March 27th, 2015 at 16:01 UTC.

Organization: TARDIS SN

Abstract: This project implements different search algorithms to select optimal TARDIS simulation parameters, that is, parameters that produce a synthetic spectrum resembling a given observed supernova spectrum. The methods to be implemented include particle swarm optimization (PSO), evolutionary algorithms, gradient descent, differential evolution and neural networks. The implemented search methods will be tested on HPC facilities and gradually refined to improve their performance.

Additional info: http://decltypeme.i2msoft.com/open-source/tardis-sn-gsoc-proposal

Proposal Contents:

1. Project at a glance
2. Project goals
3. Analysis and implementation details
4. Tentative timeline
5. About me
6. Link for more details


1. Project at a glance:

The project involves developing different search methods that tune simulation parameters so that TARDIS produces spectra resembling those of real supernovae. The project starts with preparing a model for representing candidate solutions. Then, a fitness criterion will be implemented and used to judge how good a specific candidate is and whether it outperforms other candidates.

After that, several algorithms will be implemented. Some of them will be iterative numerical search methods such as hill climbing or gradient descent. Others will be evolutionary and related stochastic methods such as genetic algorithms and simulated annealing. Others still, such as neural networks, will be trained models that can serve as an online parameter estimation module.

Each method will be tested on HPC facilities with different data-sets to assess its performance.


2. Project Goals:

2.1. Implement different approximation methods to find optimal simulation parameters.

2.2. Assess each method against different real supernova spectra and report on each one's performance.

2.3. Provide documentation and unit tests for all implemented methods.


3. Analysis and implementation details:

Doing research into an optimization problem involves several steps, beginning with precisely defining the optimization objective, then choosing the representation of candidate solutions, and finally selecting the optimization method to be used.
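
To make the optimization objective concrete, here is a minimal sketch of what the fitness criterion could look like, assuming a chi-square-style comparison between an observed and a synthetic spectrum on a shared wavelength grid; the function and the toy arrays below are illustrative placeholders, not TARDIS code.

    import numpy as np

    def fitness(observed_flux, synthetic_flux, flux_uncertainty):
        """Chi-square-like distance between spectra; lower means a better candidate."""
        residual = (observed_flux - synthetic_flux) / flux_uncertainty
        return np.sum(residual ** 2)

    # Toy arrays standing in for real spectra on a common wavelength grid.
    observed = np.array([1.00, 0.80, 1.20, 0.90])
    synthetic = np.array([1.10, 0.70, 1.15, 0.95])
    uncertainty = np.full_like(observed, 0.05)
    print(fitness(observed, synthetic, uncertainty))

The exact weighting (uncertainties, wavelength masks, continuum normalization) is part of what will be defined formally during the first weeks of the project.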

3.1. Model as a differentiable function

One of the first candidate approaches to an optimization problem is to find a differentiable cost function and minimize it using gradient descent or hill-climbing methods. This will require efficient vectorization of parameters and coefficients to guarantee high performance. Many arithmetic operations will be performed in vector (matrix) form and will make use of multi-threading and parallel programming whenever possible.
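
As a rough illustration of this approach, here is a minimal sketch of gradient descent driven by a finite-difference gradient. The run_simulation function is a hypothetical placeholder that maps a parameter vector to a synthetic spectrum (it is not a TARDIS API), and the parameters are toy quantities chosen to be of order one.

    import numpy as np

    wavelengths = np.linspace(3.0, 9.0, 200)   # toy wavelength grid (in units of 1000 Angstrom)

    def run_simulation(params):
        # Hypothetical placeholder: a single Gaussian feature instead of a real simulation.
        amplitude, centre, width = params
        return amplitude * np.exp(-((wavelengths - centre) / width) ** 2)

    def cost(params, observed):
        # Sum of squared residuals between the synthetic and the observed spectrum.
        return np.sum((run_simulation(params) - observed) ** 2)

    def numerical_gradient(params, observed, eps=1e-5):
        # Central finite differences, one parameter at a time.
        grad = np.zeros_like(params)
        for i in range(len(params)):
            step = np.zeros_like(params)
            step[i] = eps
            grad[i] = (cost(params + step, observed) - cost(params - step, observed)) / (2 * eps)
        return grad

    def gradient_descent(params, observed, learning_rate=1e-3, iterations=2000):
        # Follow the negative gradient; the cost should decrease over the iterations.
        for _ in range(iterations):
            params = params - learning_rate * numerical_gradient(params, observed)
        return params

    observed = run_simulation(np.array([1.0, 6.0, 0.4]))   # toy "observed" spectrum
    print(gradient_descent(np.array([0.8, 5.5, 0.6]), observed))

In practice, the finite-difference gradient would be replaced by analytic or automatic derivatives wherever the cost function allows it, since each cost evaluation involves a full simulation run.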

If the optimization objective follows a well-known optimization paradigm such as convex optimization, specific algorithms that have already proved efficient empirically will be implemented.

3.2. Evolutionary algorithms

In this method, the genotypic representation of a candidate solution will mainly be a bit string encoding floating-point numbers, while the phenotypic representation is the set of simulation parameters.

Implementing this method will require testing different selection methods, crossover rates and mutation rates. We may decide either to use an existing evolutionary-algorithms library and adapt it to this problem, or to build our own set of evolutionary algorithms. Whenever possible, we shall make use of parallel programming and multi-threading.
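
As a sketch of the intended structure (not a committed design), the following toy genetic algorithm encodes each parameter as a fixed-width bit string mapped onto a bounded range; the fitness function, parameter bounds, crossover and mutation rates are all placeholders.

    import random

    BITS_PER_PARAM = 16
    PARAM_BOUNDS = [(0.5, 2.0), (5.0, 7.0)]   # hypothetical bounds for two parameters

    def decode(genome):
        """Map the bit-string genotype to a list of floating-point parameters."""
        params = []
        for i, (low, high) in enumerate(PARAM_BOUNDS):
            bits = genome[i * BITS_PER_PARAM:(i + 1) * BITS_PER_PARAM]
            value = int("".join(map(str, bits)), 2) / (2 ** BITS_PER_PARAM - 1)
            params.append(low + value * (high - low))
        return params

    def fitness(genome):
        # Placeholder cost: distance to a known toy optimum; lower is better.
        target = [1.0, 6.0]
        return sum((p - t) ** 2 for p, t in zip(decode(genome), target))

    def tournament(population, k=3):
        # Tournament selection: the best of k randomly drawn candidates wins.
        return min(random.sample(population, k), key=fitness)

    def crossover(a, b):
        # Single-point crossover of two bit-string genomes.
        point = random.randrange(1, len(a))
        return a[:point] + b[point:]

    def mutate(genome, rate=0.01):
        # Flip each bit independently with a small probability.
        return [bit ^ 1 if random.random() < rate else bit for bit in genome]

    genome_length = BITS_PER_PARAM * len(PARAM_BOUNDS)
    population = [[random.randint(0, 1) for _ in range(genome_length)] for _ in range(50)]
    for _ in range(100):
        population = [mutate(crossover(tournament(population), tournament(population)))
                      for _ in range(len(population))]
    best = min(population, key=fitness)
    print(decode(best), fitness(best))

Real-valued or Gray-coded genotypes, elitism and other selection schemes could be swapped into the same structure during the testing phase.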

3.3. Simulated Annealing

Simulated annealing is a good way to randomly explore the search space by changing the state of the current candidate or set of candidates. Simulated annealing will be easy to implement and test. However, I believe gradient descent will outperform simulated annealing on this problem, because its changes of state are directed (by the gradient vector) rather than random.
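
A minimal sketch of simulated annealing over a continuous parameter vector is shown below; the cost function is a placeholder standing in for the spectrum comparison, and the cooling schedule is only an example.

    import math
    import random

    def cost(params):
        # Placeholder: distance to a known toy optimum; lower is better.
        target = [1.0, 6.0, 0.4]
        return sum((p - t) ** 2 for p, t in zip(params, target))

    def neighbour(params, scale=0.1):
        # Propose a small random change of state around the current candidate.
        return [p + random.gauss(0.0, scale) for p in params]

    def simulated_annealing(params, initial_temp=1.0, cooling=0.995, steps=5000):
        current_cost = cost(params)
        temperature = initial_temp
        for _ in range(steps):
            candidate = neighbour(params)
            candidate_cost = cost(candidate)
            # Always accept better states; accept worse ones with a Boltzmann-like probability.
            if candidate_cost < current_cost or \
               random.random() < math.exp((current_cost - candidate_cost) / temperature):
                params, current_cost = candidate, candidate_cost
            temperature *= cooling
        return params

    print(simulated_annealing([0.5, 5.0, 0.8]))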

3.4. Particle Swarm Optimization

The most challenging aspects of this method will be choosing the distribution function that initializes the population and choosing the fitness criterion.
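
For illustration, here is a minimal particle swarm optimization sketch in which the initial population is drawn from a uniform distribution over hypothetical parameter bounds; the fitness function and the inertia, cognitive and social coefficients are placeholders.

    import numpy as np

    rng = np.random.default_rng(0)
    bounds_low = np.array([0.5, 5.0, 0.1])    # hypothetical lower parameter bounds
    bounds_high = np.array([2.0, 7.0, 1.0])   # hypothetical upper parameter bounds

    def fitness(params):
        # Placeholder: distance to a known toy optimum; lower is better.
        return np.sum((params - np.array([1.0, 6.0, 0.4])) ** 2)

    n_particles, n_steps = 30, 200
    inertia, cognitive, social = 0.7, 1.5, 1.5

    # Initialize positions uniformly within the bounds; velocities start at zero.
    positions = rng.uniform(bounds_low, bounds_high, size=(n_particles, len(bounds_low)))
    velocities = np.zeros_like(positions)
    personal_best = positions.copy()
    personal_best_cost = np.array([fitness(p) for p in positions])
    global_best = personal_best[np.argmin(personal_best_cost)]

    for _ in range(n_steps):
        r1, r2 = rng.random(positions.shape), rng.random(positions.shape)
        velocities = (inertia * velocities
                      + cognitive * r1 * (personal_best - positions)
                      + social * r2 * (global_best - positions))
        positions = np.clip(positions + velocities, bounds_low, bounds_high)
        costs = np.array([fitness(p) for p in positions])
        improved = costs < personal_best_cost
        personal_best[improved] = positions[improved]
        personal_best_cost[improved] = costs[improved]
        global_best = personal_best[np.argmin(personal_best_cost)]

    print(global_best)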

3.5. Neural Networks

Neural networks will be used as a black-box function that determines the simulation parameters for a given input. We will develop the neural networks at the end of the project, and they can be trained online. This ordering is beneficial because, at that stage, we will have numerous test data generated by the previous phases, and these data can be used as a training set for the neural networks.
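
The sketch below shows the kind of regression network that could be trained for this purpose, assuming pairs of (spectrum, parameters) are available from the earlier phases; here they are synthesised from a toy spectral model, and the network is a single hidden layer trained with plain gradient descent.

    import numpy as np

    rng = np.random.default_rng(0)
    wavelengths = np.linspace(3.0, 9.0, 50)   # toy wavelength grid (in units of 1000 Angstrom)

    def toy_spectrum(params):
        # Placeholder spectral model standing in for stored simulation outputs.
        amplitude, centre, width = params
        return amplitude * np.exp(-((wavelengths - centre) / width) ** 2)

    # Build a training set: inputs are spectra, targets are the generating parameters.
    true_params = rng.uniform([0.5, 5.0, 0.2], [2.0, 7.0, 0.8], size=(500, 3))
    spectra = np.array([toy_spectrum(p) for p in true_params])

    # One-hidden-layer network: spectrum (50) -> hidden (32, tanh) -> parameters (3).
    W1 = rng.normal(0, 0.1, (50, 32)); b1 = np.zeros(32)
    W2 = rng.normal(0, 0.1, (32, 3));  b2 = np.zeros(3)
    learning_rate = 0.01

    for _ in range(2000):
        hidden = np.tanh(spectra @ W1 + b1)
        predictions = hidden @ W2 + b2
        error = predictions - true_params                  # mean-squared-error gradient
        grad_W2 = hidden.T @ error / len(spectra)
        grad_b2 = error.mean(axis=0)
        hidden_error = (error @ W2.T) * (1 - hidden ** 2)  # back-propagate through tanh
        grad_W1 = spectra.T @ hidden_error / len(spectra)
        grad_b1 = hidden_error.mean(axis=0)
        W2 -= learning_rate * grad_W2; b2 -= learning_rate * grad_b2
        W1 -= learning_rate * grad_W1; b1 -= learning_rate * grad_b1

    test = toy_spectrum(np.array([1.0, 6.0, 0.4]))
    print(np.tanh(test @ W1 + b1) @ W2 + b2)               # rough parameter estimate

In the actual project, the training pairs would come from the simulation runs performed during the earlier phases rather than from a toy model, and the architecture would be chosen after experimentation.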

Because methods that depend on the gradient vector and the differentiability of the cost function are more promising here, given the continuous nature of the search domain, they are allotted more time in the project schedule than random and evolutionary search methods.


4. Tentative Timeline:

I will finish my spring semester on 27th May, and my fall semester starts on 1st September. During the summer vacation I will be fully dedicated to the GSoC project, working around 8 hours/day, five days a week (40 hrs/week). Here is a tentative schedule for the project. I am also ready to write frequent blog posts documenting my work.

 

Tentative Project Schedule

Community Bonding Period (April 27th - May 27th):
-Getting to know the TARDIS community
-Exploring TARDIS in depth

Week 01 (27th May - 3rd June):
-Examine the problem in detail
-Report a detailed formal definition of the optimization objective and the fitness criteria
-Study the problem mathematically and categorize it (e.g. convex optimization, ...)

Weeks 02-03 (3rd June - 17th June):
-Code gradient descent and other relevant numerical methods for the problem, such as hill climbing
-Test the implemented numerical methods and fix bugs, if any
-Document this phase

Weeks 04-05 (17th June - 1st July):
-Implement evolutionary algorithms for the problem
-Test different selection methods, crossover and mutation rates
-Document this phase and report bugs, if any

Weeks 06-07 (1st July - 15th July):
-Implement simulated annealing and PSO for the problem
-Test the two methods
-Document this phase and report bugs, if any

Weeks 08-09 (15th July - 29th July):
-Implement different neural networks to solve the problem
-Train the neural networks using data from the previous methods
-Test against new data
-Miscellaneous code fixes and tests

Weeks 10-12 (29th July - 12th August):
-Continue with miscellaneous code fixes and tests
-Fix any remaining bugs
-Finalize testing, documentation and reporting
-Ship the code

Week 13 (12th August - 19th August) and Pencils Down (firm pencils down: 21st August):
-Project wrap-up

5. About Me:

Name: Islam Faisal Ibrahim
University: The American University in Cairo
Major: Computer Science and Mathematics
Contact email: islamfm@aucegypt.edu
Website: decltypeme.i2msoft.com 

Short Bio:

I am an undergraduate student double majoring in computer science and mathematics. I have worked as an undergraduate teaching assistant since my first semester at college. I spent the summer of 2014 at Nile University, where I worked as a summer trainee on machine learning in the Ubiquitous & Visual Computing Group (UbiComp).

Why I believe I am a good match for this project:

Genetic algorithms:
I have been working with GAs for over two years now. Currently, I am doing research into GA selection operators that cluster the population before performing selection.

Optimization:
Since I am a mathematics major, I believe I have sufficient knowledge to survey different mathematical optimization algorithms and implement them.

Software Development:
I have been working with C++ software development for more than two years. I used C++ in various academic projects and during my previous internship at Nile University. I took a course covering object-oriented programming, C++11 and socket programming in Fall 2014. I worked as an undergraduate teaching assistant for a data structures and algorithms course in C++. I also have basic knowledge of programming in Python.

My detailed CV:
decltypeme.i2msoft.com/cv
My academic courses:
decltypeme.i2msoft.com/education


6. More details:

Any further details, updates or thoughts about the project will be added to this webpage:
http://decltypeme.i2msoft.com/open-source/tardis-sn-gsoc-proposal
