The Bayesian Blend: Chhaya (छाया)
Detection and classification of exoplanet transits in noisy light curves. A two-stage pipeline: Box Least Squares detection, then a gradient-boosted classifier. Flux-only input, 28 features, no neural networks, runs on CPU.
Live now: exo.solar-is.app — the detector runs 24/7. · Code and notebooks on GitHub · Submission page
| Metric | Value |
|---|---|
| Stars processed | 7,943 (Kepler, TESS, K2) |
| Detection PR-AUC | 0.89 |
| Classification ROC-AUC | 0.919 |
| Kepler to TESS transfer | 0.855 |
| Marginal cost per star | zero |
Team (all from BMSIT)
| Member | Background |
|---|---|
| Srisha KS (Team Leader) | ML Intern, Indian Navy (INICAI). MALLORN Kaggle challenge: F1 0.68, 32/700 private leaderboard (best submission). |
| Swatantra Tiwari | ML Intern, Indian Navy (INICAI). |
| Adviktha Kargod Prashasth | Founding member, Strategi. |
| Spandan Ray | AI Engineer, KlarDataLabs, Zurich. |
Track record
- Smart India Hackathon 2025: Winners. Space Applications Centre (ISRO) problem on surface-level O3 and NO2 forecasting. Mentored by Imran Girach, Scientist, SAC (ISRO), Ahmedabad.
- MALLORN Challenge (Kaggle, 2025 to 2026). Photometric classification of astronomical light curves on Rubin / LSST simulations. Srisha KS: F1 0.68, 32/700 private leaderboard (best submission). Same task modality as this problem.
Problem
| Signal | Cause | Shape |
|---|---|---|
| Transit | planet crossing the star | flat-bottomed U, no secondary |
| Eclipse | companion star | V, secondary eclipse |
| Blend | diluted nearby eclipser | odd/even depth mismatch |
| Other | spots, pulsation, systematics | no coherent transit |
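The odd/even depth mismatch listed for blends can be checked directly once a period is known. The sketch below is a minimal illustration, not the project's code: the function name `odd_even_depths` and the synthetic light curve are assumptions made for the example. It measures the mean dip of odd-numbered and even-numbered transits separately; a large difference suggests an eclipsing binary folded at half its true period.

```python
import numpy as np

def odd_even_depths(time, flux, period, t0, duration):
    """Return (odd_depth, even_depth) relative to the out-of-transit median."""
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    # Index of the nearest transit for every sample.
    epoch = np.round((time - t0) / period).astype(int)
    # Time offset from that transit's midpoint.
    dt = time - (t0 + epoch * period)
    in_transit = np.abs(dt) < 0.5 * duration
    baseline = np.median(flux[~in_transit])
    odd = in_transit & (epoch % 2 == 1)
    even = in_transit & (epoch % 2 == 0)
    return baseline - flux[odd].mean(), baseline - flux[even].mean()

# Synthetic example (hypothetical values): alternating 1.0% and 0.4% dips.
time = np.arange(0.0, 30.0, 0.02)
flux = np.ones_like(time)
period, t0, duration = 3.0, 1.5, 0.2
epoch = np.round((time - t0) / period).astype(int)
in_tr = np.abs(time - (t0 + epoch * period)) < 0.5 * duration
flux[in_tr & (epoch % 2 == 0)] -= 0.010
flux[in_tr & (epoch % 2 == 1)] -= 0.004

odd_depth, even_depth = odd_even_depths(time, flux, period, t0, duration)
mismatch = abs(odd_depth - even_depth) / max(odd_depth, even_depth)
```

On real data the two depths would be compared against their uncertainties rather than by a raw ratio.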
Methodology
| # | Objective | Component |
|---|---|---|
| 1 | Detrend | spline / Savitzky-Golay |
| 2 | Identify | Box Least Squares period search |
| 3 | Classify | 28 features, gradient-boosted trees |
| 4 | Significance | SDE and SNR vs 7.1 sigma |
| 5 | Characterize | period, depth, duration vs catalog |
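As an illustration of step 1, here is a minimal NumPy-only sketch of Savitzky-Golay detrending. It is an assumption about how such a step could look, not the pipeline's implementation: the names `savgol_weights` and `detrend`, the window of 101 samples and the polynomial order 2 are all chosen for the example. The filter fits a low-order polynomial in a sliding window and the flux is divided by that trend.

```python
import numpy as np

def savgol_weights(window, order):
    """Least-squares polynomial smoothing weights for the centre of an odd window."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are offsets**0 .. offsets**order.
    design = np.vander(offsets, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse maps window samples to the fitted value at offset 0.
    return np.linalg.pinv(design)[0]

def detrend(flux, window=101, order=2):
    """Divide out a Savitzky-Golay trend so the result is normalised to about 1."""
    flux = np.asarray(flux, dtype=float)
    half = window // 2
    padded = np.pad(flux, half, mode="edge")  # crude edge handling
    weights = savgol_weights(window, order)
    trend = np.convolve(padded, weights[::-1], mode="valid")
    return flux / trend

# Hypothetical light curve: a smooth quadratic drift with no transit.
t = np.linspace(0.0, 30.0, 1500)
raw = 1.0 + 0.01 * t + 0.001 * t**2
flat = detrend(raw)
```

The window has to be several times longer than the transit duration (or the in-transit points masked), otherwise the filter absorbs part of the transit it is meant to preserve.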
Dataset
| Survey | Planet | False pos | Binary | Other | Total |
|---|---|---|---|---|---|
| Kepler | 1,136 | 1,822 | 0 | 2,038 | 4,996 |
| TESS | 656 | 236 | 1,594 | 0 | 2,486 |
| K2 | 295 | 166 | 0 | 0 | 461 |
| Total | 2,087 | 2,224 | 1,594 | 2,038 | 7,943 |
<div style="color:var(--color-text)"><canvas id="c-classes"></canvas></div>
<script>
{ // Block scope: the other chart scripts on this page reuse the same const names.
const cs=getComputedStyle(document.body),txt=cs.getPropertyValue('--color-text')||'#333',sec=cs.getPropertyValue('--color-text-secondary')||'#888',grid=cs.getPropertyValue('--color-border')||'#e0ddd4';
new Chart(document.getElementById('c-classes'),{type:'bar',
data:{labels:['Kepler','TESS','K2'],datasets:[
{label:'planet',data:[1136,656,295],backgroundColor:'#5c8a5c'},
{label:'false pos',data:[1822,236,166],backgroundColor:'#b98a3e'},
{label:'binary',data:[0,1594,0],backgroundColor:'#b5654a'},
{label:'other',data:[2038,0,0],backgroundColor:'#6d829e'}]},
options:{plugins:{legend:{labels:{color:txt,boxWidth:12,font:{size:11}}},title:{display:true,text:'Class distribution by survey',color:txt,font:{size:12}}},
scales:{x:{stacked:true,ticks:{color:sec},grid:{display:false}},y:{stacked:true,ticks:{color:sec},grid:{color:grid}}}}});
}
</script>
| Survey | Baseline / star | Cadence |
|---|---|---|
| Kepler | ~33 d (1 quarter) | 29.4 min |
| TESS | ~27 d (1 sector) | 2 min |
| K2 | ~77 d (1 campaign) | 29.4 min |
Stage 1: detection, significance, characterization
| Detection | Value |
|---|---|
| PR-AUC | 0.89 |
| Precision at 7.1 sigma | 0.94 |
| Recall, single quarter | 0.53 |
| Recall, full baseline | 0.945 |
| Class | Median SDE | Fraction > 7.1 sigma |
|---|---|---|
| Binary | 13.7 | 0.87 |
| Planet | 5.8 | 0.39 |
| False positive | 5.8 | 0.39 |
| Other | 4.8 | 0.06 |
<div style="color:var(--color-text)"><canvas id="c-sde"></canvas></div>
<script>
{ // Block scope: the other chart scripts on this page reuse the same const names.
const cs=getComputedStyle(document.body),txt=cs.getPropertyValue('--color-text')||'#333',sec=cs.getPropertyValue('--color-text-secondary')||'#888',grid=cs.getPropertyValue('--color-border')||'#e0ddd4';
new Chart(document.getElementById('c-sde'),{type:'bar',
data:{labels:['binary','planet','false pos','other'],datasets:[{data:[0.87,0.39,0.39,0.06],backgroundColor:['#b5654a','#5c8a5c','#b98a3e','#6d829e']}]},
options:{plugins:{legend:{display:false},title:{display:true,text:'Fraction clearing 7.1 sigma',color:txt,font:{size:12}}},
scales:{x:{ticks:{color:sec},grid:{display:false}},y:{min:0,max:1,ticks:{color:sec},grid:{color:grid}}}}});
}
</script>
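Planets and false positives have the same median SDE (5.8) and the same fraction above 7.1 sigma (0.39), so detection significance alone cannot separate them; shape-based classification is required.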
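For readers unfamiliar with the statistics in this stage, the following is a simplified NumPy sketch of a Box Least Squares search and the signal detection efficiency (SDE) computed from its spectrum. It follows the signal-residue definition of Kovacs et al. (2002) with uniform weights; the function names, the bin count, the period grid and the synthetic light curve are assumptions for illustration and are not taken from the project's code.

```python
import numpy as np

def bls_spectrum(time, flux, periods, n_bins=100, max_width=10):
    """Signal residue for each trial period (uniform weights, dips only)."""
    x = flux - flux.mean()
    n = len(x)
    sr = np.zeros(len(periods))
    for k, period in enumerate(periods):
        # Phase-fold and bin the zero-mean flux.
        bins = ((time % period) / period * n_bins).astype(int) % n_bins
        s_bin = np.bincount(bins, weights=x, minlength=n_bins) / n
        r_bin = np.bincount(bins, minlength=n_bins) / n
        # Tile twice so a box may wrap past phase 1; prefix sums give box totals.
        cs = np.concatenate(([0.0], np.cumsum(np.tile(s_bin, 2))))
        cr = np.concatenate(([0.0], np.cumsum(np.tile(r_bin, 2))))
        best = 0.0
        for width in range(1, max_width + 1):
            s = cs[width:width + n_bins] - cs[:n_bins]
            r = cr[width:width + n_bins] - cr[:n_bins]
            ok = (r > 0) & (r < 1) & (s < 0)
            if ok.any():
                best = max(best, float(np.max(s[ok] ** 2 / (r[ok] * (1 - r[ok])))))
        sr[k] = np.sqrt(best)
    return sr

def sde(sr):
    """Peak height of the spectrum in units of its standard deviation."""
    return (sr.max() - sr.mean()) / sr.std()

# Synthetic light curve (hypothetical): 1% dips every 3 days plus white noise.
rng = np.random.default_rng(0)
time = np.arange(0.0, 30.0, 0.02)
flux = 1.0 + rng.normal(0.0, 0.001, time.size)
true_period, t0, duration = 3.0, 1.5, 0.2
phase_offset = (time - t0 + 0.5 * true_period) % true_period - 0.5 * true_period
flux[np.abs(phase_offset) < 0.5 * duration] -= 0.01

periods = np.linspace(2.0, 4.0, 201)
sr = bls_spectrum(time, flux, periods)
best_period = periods[np.argmax(sr)]
significance = sde(sr)
passes_threshold = significance > 7.1  # same cut as in the tables above
```

The SDE depends on the period grid and baseline as well as on the signal, so a value from this toy grid is not comparable to the medians reported in the table.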
| Characterization vs KOI / TOI (1,883 planets) | Value |
|---|---|
| Period within 2%, confident | 0.93 |
| Depth log-correlation | 0.81 |
| Duration within 30% | 0.49 |
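The three characterization metrics can be computed along the following lines. This is a sketch under assumptions: the helper names `frac_within` and `log_correlation` are hypothetical, the arrays are made-up values rather than project results, and the selection of "confident" detections is not shown.

```python
import numpy as np

def frac_within(pred, true, tol):
    """Fraction of entries whose relative error |pred - true| / true is below tol."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.mean(np.abs(pred - true) / true < tol))

def log_correlation(pred, true):
    """Pearson correlation between log10(pred) and log10(true)."""
    return float(np.corrcoef(np.log10(pred), np.log10(true))[0, 1])

# Made-up values for illustration only (periods in days, depths in ppm, durations in hours).
true_period, pred_period = [3.01, 10.0, 5.0, 2.00], [3.00, 10.1, 5.5, 2.02]
true_depth = np.array([200.0, 1500.0, 800.0, 12000.0])
pred_depth = 0.8 * true_depth  # a constant bias leaves the log-correlation at 1
true_dur, pred_dur = [2.5, 3.0, 1.1, 3.0], [2.0, 3.6, 1.0, 5.0]

period_ok = frac_within(pred_period, true_period, 0.02)
depth_r = log_correlation(pred_depth, true_depth)
duration_ok = frac_within(pred_dur, true_dur, 0.30)
```

Because the depth metric is a correlation in log space, it rewards correct ranking across orders of magnitude and is insensitive to a constant multiplicative bias, as the example depth array shows.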
Stage 2: classification
| Metric | Value |
|---|---|
| Planet vs rest ROC-AUC | 0.919 |
| Macro 4-class ROC-AUC | 0.934 |
| Planet vs false positive | 0.911 |
| PR-AUC | 0.845 |
| Macro F1 | 0.770 |
| True class | Recall |
|---|---|
| Other | 0.90 |
| Binary | 0.84 |
| Planet | 0.71 |
| False positive | 0.64 |
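Per-class recall and macro F1 both derive from the confusion matrix. The sketch below shows the arithmetic on an invented 4-class matrix; the counts are hypothetical and are not the project's confusion matrix.

```python
import numpy as np

classes = ["other", "binary", "planet", "false positive"]
# Hypothetical confusion matrix: rows = true class, columns = predicted class.
cm = np.array([
    [45,  2,  1,  2],
    [ 3, 40,  4,  3],
    [ 2,  5, 30, 13],
    [ 4,  3, 12, 31],
])

tp = np.diag(cm).astype(float)
recall = tp / cm.sum(axis=1)      # share of each true class that was recovered
precision = tp / cm.sum(axis=0)   # share of each predicted class that was correct
f1 = 2 * precision * recall / (precision + recall)
macro_f1 = f1.mean()              # unweighted mean, so small classes count equally
```

Macro averaging weights all four classes equally regardless of how many stars each contains, which matters here because the class counts differ by survey.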
<div style="color:var(--color-text)"><canvas id="c-features"></canvas></div>
<script>
{ // Block scope: the other chart scripts on this page reuse the same const names.
const cs=getComputedStyle(document.body),txt=cs.getPropertyValue('--color-text')||'#333',sec=cs.getPropertyValue('--color-text-secondary')||'#888',grid=cs.getPropertyValue('--color-border')||'#e0ddd4';
new Chart(document.getElementById('c-features'),{type:'bar',
data:{labels:['snr','peak_ratio','sde','depth','binned_entropy','cid_ce','boxiness','n_transits','durfrac','sine_chi2'],
datasets:[{data:[.085,.066,.057,.055,.049,.047,.046,.044,.043,.043],backgroundColor:'#6d829e'}]},
options:{indexAxis:'y',plugins:{legend:{display:false},title:{display:true,text:'Top feature importances (gain)',color:txt,font:{size:12}}},
scales:{x:{ticks:{color:sec},grid:{color:grid}},y:{ticks:{color:sec,font:{size:10}},grid:{display:false}}}}});
}
</script>
Boxiness encodes the U versus V transit shape, and the secondary-eclipse features capture the second dip that a companion star produces; together they separate planets from binaries.
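The project's exact definition of `boxiness` is not given here, so the following is only one plausible way to quantify U versus V shape, offered as an assumption. It fits a flat-bottomed box and a triangular V template to the in-transit dip by least squares and compares the residuals; the function name `shape_score` and the score definition are hypothetical.

```python
import numpy as np

def shape_score(phase, flux, half_width):
    """Score near 1 for box-like (U) dips, near 0 for V-shaped dips.

    phase: offset from mid-transit, same units as half_width.
    flux: normalised flux with an out-of-transit level of 1.
    """
    phase = np.asarray(phase, dtype=float)
    flux = np.asarray(flux, dtype=float)
    inside = np.abs(phase) < half_width
    dip = 1.0 - flux[inside]                          # positive inside the transit
    box = np.ones(int(inside.sum()))                  # flat-bottomed template
    vee = 1.0 - np.abs(phase[inside]) / half_width    # triangle, 1 at the centre

    def rss(template):
        depth = template @ dip / (template @ template)  # least-squares depth
        return float(np.sum((dip - depth * template) ** 2))

    rss_box, rss_v = rss(box), rss(vee)
    return rss_v / (rss_box + rss_v)

# Two synthetic dips (hypothetical), both 1% deep at the centre.
phase = np.linspace(-0.2, 0.2, 81)
inside = np.abs(phase) < 0.1
u_flux = np.ones_like(phase)
u_flux[inside] -= 0.01
v_flux = np.ones_like(phase)
v_flux[inside] -= 0.01 * (1.0 - np.abs(phase[inside]) / 0.1)

u_score = shape_score(phase, u_flux, 0.1)
v_score = shape_score(phase, v_flux, 0.1)
```

With noisy data the score moves toward 0.5 for shallow dips, since neither template fits much better than the other.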
Cross-survey transfer
| Train | Test | ROC-AUC | PR-AUC |
|---|---|---|---|
| Kepler + K2 | TESS | 0.855 | 0.672 |
| TESS + K2 | Kepler | 0.777 | 0.655 |
TESS is the evaluation survey, so transfer to TESS is the number that matters.
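The transfer protocol amounts to fitting on some surveys and scoring only on a held-out one. The sketch below illustrates that split together with a rank-based ROC-AUC. It is a sketch under stated assumptions: the nearest-centroid scorer is a stand-in, not the pipeline's gradient-boosted trees, and the function names and random data are invented for the example.

```python
import numpy as np

def roc_auc(y_true, score):
    """ROC-AUC from ranks (Mann-Whitney U), with average ranks for ties."""
    y_true = np.asarray(y_true, dtype=bool)
    score = np.asarray(score, dtype=float)
    order = np.argsort(score, kind="mergesort")
    ranks = np.empty(len(score))
    ranks[order] = np.arange(1, len(score) + 1)
    for value in np.unique(score):
        tie = score == value
        ranks[tie] = ranks[tie].mean()
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def transfer_auc(X, y, survey, train_surveys, test_survey):
    """Fit on train_surveys only, then score the held-out test_survey."""
    train = np.isin(survey, train_surveys)
    test = survey == test_survey
    # Stand-in model: nearest centroid, fitted on training rows only.
    mu_pos = X[train & (y == 1)].mean(axis=0)
    mu_neg = X[train & (y == 0)].mean(axis=0)
    score = (np.linalg.norm(X[test] - mu_neg, axis=1)
             - np.linalg.norm(X[test] - mu_pos, axis=1))
    return roc_auc(y[test], score)

# Random data for illustration: two features, with a small offset for TESS rows.
rng = np.random.default_rng(0)
n = 300
survey = np.array(["Kepler", "K2", "TESS"])[rng.integers(0, 3, n)]
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 2)) + 3.0 * y[:, None] + 0.5 * (survey == "TESS")[:, None]

auc_tess = transfer_auc(X, y, survey, ["Kepler", "K2"], "TESS")
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Keeping every fitted quantity (here the centroids) restricted to the training surveys is what makes the held-out number a transfer estimate rather than an in-distribution one.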
Benchmark vs published methods
| Method | Data | Samples | Baseline | Task | ROC-AUC |
|---|---|---|---|---|---|
| Shallue 2018 | Kepler DR24 | 15,737 | ~4 yr | binary | 0.988 |
| Malik 2022 | Kepler DR24 | 15,737 | ~4 yr | binary | 0.948 |
| ExoMiner 2022 | Kepler DR25 | 30,609 | ~4 yr | binary | 1.000 |
| Project Chhaya | Kepler+TESS+K2 | 7,943 | ~33 d | 4-class | 0.919 |
<div style="color:var(--color-text)"><canvas id="c-benchmark"></canvas></div>
<script>
{ // Block scope: the other chart scripts on this page reuse the same const names.
const cs=getComputedStyle(document.body),txt=cs.getPropertyValue('--color-text')||'#333',sec=cs.getPropertyValue('--color-text-secondary')||'#888',grid=cs.getPropertyValue('--color-border')||'#e0ddd4';
new Chart(document.getElementById('c-benchmark'),{type:'bar',
data:{labels:['Shallue','Malik','ExoMiner','Chhaya'],datasets:[{data:[0.988,0.948,1.000,0.919],backgroundColor:['#c3bfb2','#c3bfb2','#c3bfb2','#6d829e']}]},
options:{plugins:{legend:{display:false},title:{display:true,text:'ROC-AUC (ours on ~2.5% of the data, 4-class)',color:txt,font:{size:12}}},
scales:{x:{ticks:{color:sec},grid:{display:false}},y:{min:0.85,max:1,ticks:{color:sec},grid:{color:grid}}}}});
}
</script>
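The published methods classify pre-detected candidates using ~4 years of data per star. We detect and classify from raw light curves using ~2.5% of that per-star baseline, on a harder 4-class task. Against the non-network state of the art (Malik 2022, ROC-AUC 0.948) we trail by 0.029 on planet vs rest (0.919) and by 0.037 on planet vs false positive (0.911). We expect the gap to narrow as baseline is added, since detection recall rises from 0.53 on a single quarter to 0.945 on the full baseline.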
References
- Shallue, C. J. and Vanderburg, A. 2018, AJ 155, 94 (AstroNet). arXiv:1712.05044
- Malik, A., Moster, B. P. and Obermeier, C. 2022, MNRAS 513, 5505. arXiv:2011.14135
- Valizadegan, H. et al. 2022, ApJ 926, 120 (ExoMiner). arXiv:2111.10009
- Kovacs, G., Zucker, S. and Mazeh, T. 2002, A&A 391, 369 (Box Least Squares). arXiv:astro-ph/0206099
- Hippke, M. and Heller, R. 2019, A&A 623, A39 (Transit Least Squares). arXiv:1901.02015
Links