The Bayesian Blend: Project Chhaya (छाया)

Detection and classification of exoplanet transits in noisy light curves. Two-stage pipeline: Box Least Squares detection, then a gradient-boosted classifier. Flux only, 28 features, no neural networks, CPU.

Live now: exo.solar-is.app — the detector runs 24/7.  ·  Code and notebooks on GitHub  ·  Submission page

Stars processed: 7,943 (Kepler, TESS, K2)
Detection PR-AUC: 0.89
Classification ROC-AUC: 0.919
Kepler to TESS transfer: 0.855
Marginal cost per star: zero

Team (all from BMSIT)

| Member | Background |
|---|---|
| Srisha KS (Team Leader) | ML Intern, Indian Navy (INICAI). MALLORN Kaggle challenge: F1 0.68, 32/700 private leaderboard (best submission). |
| Swatantra Tiwari | ML Intern, Indian Navy (INICAI). |
| Adviktha Kargod Prashasth | Founding member, Strategi. |
| Spandan Ray | AI Engineer, KlarDataLabs, Zurich. |

Track record


Problem

| Signal | Cause | Shape |
|---|---|---|
| Transit | planet crossing the star | flat-bottomed U, no secondary |
| Eclipse | companion star | V, secondary eclipse |
| Blend | diluted nearby eclipser | odd/even depth mismatch |
| Other | spots, pulsation, systematics | no coherent transit |
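
The odd/even depth mismatch listed for blends can be checked directly on a light curve once a candidate period is known. The following is a minimal sketch, assuming flux normalized to 1 and a known period, epoch `t0`, and duration. The function name `odd_even_mismatch` and its significance estimate are illustrative and are not taken from the project code.

```python
from math import sqrt
from statistics import mean, stdev


def odd_even_mismatch(time, flux, period, t0, duration):
    """Return (odd_depth, even_depth, sigma) for a candidate signal.

    Depth is 1 - mean in-transit flux, assuming flux is normalized to 1.
    sigma is the depth difference in units of its standard error.
    """
    odd, even = [], []
    for t, f in zip(time, flux):
        # Index of the nearest transit epoch.
        n = round((t - t0) / period)
        # Keep only points inside that transit window.
        if abs(t - (t0 + n * period)) <= duration / 2:
            (odd if n % 2 else even).append(f)
    if len(odd) < 2 or len(even) < 2:
        raise ValueError("need at least two in-transit points per parity")
    d_odd, d_even = 1 - mean(odd), 1 - mean(even)
    # Standard error of the difference between the two mean depths.
    err = sqrt(stdev(odd) ** 2 / len(odd) + stdev(even) ** 2 / len(even))
    sigma = abs(d_odd - d_even) / err if err > 0 else float("inf")
    return d_odd, d_even, sigma
```

A large `sigma` suggests that alternating events have different depths, which is what an eclipsing binary folded at half its true period would produce.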

Methodology

| # | Objective | Component |
|---|---|---|
| 1 | Detrend | spline / Savitzky-Golay |
| 2 | Identify | Box Least Squares period search |
| 3 | Classify | 28 features, gradient-boosted trees |
| 4 | Significance | SDE and SNR vs 7.1 sigma |
| 5 | Characterize | period, depth, duration vs catalog |
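
Steps 1 and 2 can be illustrated with a dependency-free sketch. It is not the project's implementation: a running median stands in for the spline / Savitzky-Golay detrender, and the score (depth times the square root of the number of points in the box) is a simplified stand-in for the Box Least Squares statistic. The names `running_median_detrend` and `box_search` and the defaults `n_bins` and `width_bins` are assumptions made for illustration.

```python
from statistics import mean, median


def running_median_detrend(flux, window=11):
    """Divide each point by the median of a centered window (in samples).

    Stand-in for the spline / Savitzky-Golay step.
    """
    half = window // 2
    out = []
    for i in range(len(flux)):
        lo, hi = max(0, i - half), min(len(flux), i + half + 1)
        out.append(flux[i] / median(flux[lo:hi]))
    return out


def box_search(time, flux, periods, n_bins=50, width_bins=2):
    """Score each trial period by its deepest phase-folded box.

    Returns one score per trial period: depth * sqrt(points in box).
    """
    baseline = mean(flux)
    power = []
    for period in periods:
        # Fold on the trial period and accumulate flux per phase bin.
        sums = [0.0] * n_bins
        counts = [0] * n_bins
        for t, f in zip(time, flux):
            b = int((t % period) / period * n_bins) % n_bins
            sums[b] += f
            counts[b] += 1
        # Slide a box of width_bins bins around the folded curve.
        best = 0.0
        for start in range(n_bins):
            idx = [(start + k) % n_bins for k in range(width_bins)]
            n = sum(counts[i] for i in idx)
            if n == 0:
                continue
            depth = baseline - sum(sums[i] for i in idx) / n
            best = max(best, depth * n ** 0.5)
        power.append(best)
    return power
```

The detrending window should be much longer than the expected transit duration, otherwise the median follows the dip and removes the signal. The trial period with the highest score is the detection candidate.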

Dataset

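| Survey | Planet | False pos | Binary | Other | Total |
|---|---|---|---|---|---|
| Kepler | 1,136 | 1,822 | 0 | 2,038 | 4,996 |
| TESS | 656 | 236 | 1,594 | 0 | 2,486 |
| K2 | 295 | 166 | 0 | 0 | 461 |
| Total | 2,087 | 2,224 | 1,594 | 2,038 | 7,943 |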

Figure: Class distribution by survey (stacked bars of planet, false pos, binary, and other counts for Kepler, TESS, and K2).

| Survey | Baseline / star | Cadence |
|---|---|---|
| Kepler | ~33 d (1 quarter) | 29.4 min |
| TESS | ~27 d (1 sector) | 2 min |
| K2 | ~77 d (1 campaign) | 29.4 min |

Stage 1: detection, significance, characterization

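| Detection | Value |
|---|---|
| PR-AUC | 0.89 |
| Precision at 7.1 sigma | 0.94 |
| Recall, single quarter | 0.53 |
| Recall, full baseline | 0.945 |

| Class | Median SDE | Fraction > 7.1 sigma |
|---|---|---|
| Binary | 13.7 | 0.87 |
| Planet | 5.8 | 0.39 |
| False positive | 5.8 | 0.39 |
| Other | 4.8 | 0.06 |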

Figure: Fraction clearing 7.1 sigma, by class (binary, planet, false pos, other).

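Planets and false positives have identical significance (median SDE 5.8, with 39% of each clearing 7.1 sigma), so a significance cut alone cannot separate them and shape classification is required.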
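
For reference, the significance test can be sketched as follows. SDE is taken here to be the usual signal detection efficiency of a periodogram, (peak minus mean) divided by the standard deviation. That definition is an assumption, since the page does not spell it out, and the function names are illustrative.

```python
from statistics import mean, pstdev

SDE_THRESHOLD = 7.1  # significance cut quoted on this page


def signal_detection_efficiency(power):
    """(peak - mean) / standard deviation of a periodogram."""
    spread = pstdev(power)
    if spread == 0:
        return 0.0  # flat periodogram: no peak stands out
    return (max(power) - mean(power)) / spread


def is_significant(power, threshold=SDE_THRESHOLD):
    """True when the strongest peak clears the threshold."""
    return signal_detection_efficiency(power) > threshold
```

Because this score depends only on how far the peak stands above the rest of the periodogram, a planet and a false positive with equally strong periodic dips receive the same value.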

| Characterization vs KOI / TOI (1,883 planets) | Value |
|---|---|
| Period within 2%, confident | 0.93 |
| Depth log-correlation | 0.81 |
| Duration within 30% | 0.49 |
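
The characterization scores above compare recovered parameters against catalog values. A minimal sketch of how such scores could be computed is shown below, assuming "within 2%" means relative error against the catalog value and "log-correlation" means the Pearson correlation of log10 values. Both readings, and the function names, are assumptions.

```python
from math import log10, sqrt


def fraction_within(recovered, catalog, tolerance):
    """Fraction of values whose relative error vs catalog is <= tolerance."""
    hits = sum(abs(r - c) <= tolerance * c for r, c in zip(recovered, catalog))
    return hits / len(catalog)


def log_correlation(recovered, catalog):
    """Pearson correlation between log10(recovered) and log10(catalog)."""
    x = [log10(v) for v in recovered]
    y = [log10(v) for v in catalog]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

For example, `fraction_within(periods, catalog_periods, 0.02)` would give the period score and `fraction_within(durations, catalog_durations, 0.30)` the duration score, where the list names are placeholders.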

Stage 2: classification

| Metric | Value |
|---|---|
| Planet vs rest ROC-AUC | 0.919 |
| Macro 4-class ROC-AUC | 0.934 |
| Planet vs false positive | 0.911 |
| PR-AUC | 0.845 |
| Macro F1 | 0.770 |

| True class | Recall |
|---|---|
| Other | 0.90 |
| Binary | 0.84 |
| Planet | 0.71 |
| False positive | 0.64 |
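
Per-class recall and macro F1 can be derived from paired true and predicted labels. The sketch below uses the standard definitions. The function names are illustrative and the project's own evaluation code may differ.

```python
def per_class_scores(y_true, y_pred, classes):
    """Return {class: (precision, recall, f1)} from paired label lists."""
    scores = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    scores = per_class_scores(y_true, y_pred, classes)
    return sum(f1 for _, _, f1 in scores.values()) / len(classes)
```

Macro averaging weights the four classes equally regardless of how many stars fall in each, which matters here because the class counts differ widely across surveys.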

Figure: Top feature importances (gain).

| Feature | Importance (gain) |
|---|---|
| snr | 0.085 |
| peak_ratio | 0.066 |
| sde | 0.057 |
| depth | 0.055 |
| binned_entropy | 0.049 |
| cid_ce | 0.047 |
| boxiness | 0.046 |
| n_transits | 0.044 |
| durfrac | 0.043 |
| sine_chi2 | 0.043 |

Boxiness and secondary-eclipse features encode the U versus V shape that separates planets from binaries.
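
The page does not define the boxiness feature, so the following is only one plausible construction, offered as an assumption: fit a flat-bottomed box and a V-shaped dip to the in-transit points and compare their residuals. The function name, the phase convention, and the [-1, 1] scaling are all hypothetical.

```python
def boxiness(phase, flux):
    """Compare box and V-shaped fits to in-transit points.

    phase: offset from mid-transit in units of the transit duration, so
           in-transit points satisfy |phase| <= 0.5.
    flux:  normalized flux (out-of-transit level = 1).
    Returns a value in [-1, 1]: positive favors a box (U), negative a V.
    """
    pts = [(x, 1.0 - f) for x, f in zip(phase, flux) if abs(x) <= 0.5]
    if not pts:
        raise ValueError("no in-transit points")
    # Box model: constant dip; the least-squares depth is the mean dip.
    d_box = sum(d for _, d in pts) / len(pts)
    rss_box = sum((d - d_box) ** 2 for _, d in pts)
    # V model: dip = a * (1 - 2|x|), deepest at mid-transit, zero at the edges.
    w = [1.0 - 2.0 * abs(x) for x, _ in pts]
    denom = sum(v * v for v in w)
    a = sum(v * d for v, (_, d) in zip(w, pts)) / denom if denom else 0.0
    rss_v = sum((d - a * v) ** 2 for v, (_, d) in zip(w, pts))
    total = rss_box + rss_v
    return (rss_v - rss_box) / total if total else 0.0
```

A flat-bottomed dip is fit well by the box and poorly by the V, pushing the value toward 1, while a grazing eclipse does the opposite.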

Cross-survey transfer

| Train | Test | ROC-AUC | PR-AUC |
|---|---|---|---|
| Kepler + K2 | TESS | 0.855 | 0.672 |
| TESS + K2 | Kepler | 0.777 | 0.655 |

TESS is the evaluation survey, so transfer to TESS is the number that matters.
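
The transfer protocol amounts to holding out one survey entirely and scoring a model trained on the others. A minimal sketch of the split and of a rank-based ROC-AUC follows. The row layout (a dict with a `"survey"` key) and the function names are assumptions for illustration, and the quadratic-time AUC is chosen for clarity, not speed.

```python
def split_by_survey(rows, train_surveys, test_survey):
    """Partition rows so the test survey never appears in training."""
    train = [r for r in rows if r["survey"] in train_surveys]
    test = [r for r in rows if r["survey"] == test_survey]
    return train, test


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    labels are 1 (positive) or 0 (negative); ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With `train, test = split_by_survey(rows, {"Kepler", "K2"}, "TESS")`, a classifier fitted on `train` and scored on `test` would yield the kind of number reported in the first row of the table.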


Benchmark vs published methods

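| Method | Data | Samples | Baseline | Task | ROC-AUC |
|---|---|---|---|---|---|
| Shallue 2018 | Kepler DR24 | 15,737 | ~4 yr | binary | 0.988 |
| Malik 2022 | Kepler DR24 | 15,737 | ~4 yr | binary | 0.948 |
| ExoMiner 2022 | Kepler DR25 | 30,609 | ~4 yr | binary | 1.000 |
| Project Chhaya | Kepler+TESS+K2 | 7,943 | ~33 d | 4-class | 0.919 |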

Figure: ROC-AUC by method (ours on ~2.5% of the data, 4-class).

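The published methods classify pre-detected candidates using ~4 years of data per star. We detect and classify from raw light curves using ~2.5% of the per-star data, on a harder 4-class task. We trail the non-network state of the art by 0.037 ROC-AUC (Malik 2022 at 0.948 against our planet vs false positive score of 0.911), a gap that closes as baseline is added.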


References

Links