PS7 — AI-enabled Detection of Exoplanets from Noisy Astronomical Light Curves
Part of Bharatiya Antariksh Hackathon 2026 (team The Bayesian Blend). See invite, the sibling note ps3-surface-aqi-hcho, and the decision-ps3-vs-ps7 writeup.
The one-line crux (from the ISRO session)
A planet crossing its star dims the star's light by a tiny, periodic amount. The whole problem is: pull that periodic dip out of noisy light curves, then tell a real planet apart from look-alikes (eclipsing binaries, blends, detector artefacts).
Three observables characterise a transit to first order:
- Depth — how much the light drops, $\propto (R_p/R_\star)^2$ (planet size).
- Duration — how long the dip lasts (crossing time).
- Period — how often it repeats (orbital period).
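To make the depth relation concrete, here is a minimal sketch that converts between planet size and transit depth. It assumes a dark planet crossing a uniformly bright stellar disc (no limb darkening, no grazing geometry), so depth equals $(R_p/R_\star)^2$ exactly. The radius constants and function names are my own illustrative choices, not part of the problem statement.

```python
import math

# Nominal radii in km (assumed reference values for illustration).
R_SUN_KM = 695_700.0
R_JUP_KM = 69_911.0
R_EARTH_KM = 6_371.0


def transit_depth(r_planet_km: float, r_star_km: float) -> float:
    """Fractional flux drop for a dark planet on a uniform stellar disc."""
    return (r_planet_km / r_star_km) ** 2


def radius_ratio(depth: float) -> float:
    """Invert the depth relation: Rp/R* = sqrt(depth)."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return math.sqrt(depth)


jupiter_depth = transit_depth(R_JUP_KM, R_SUN_KM)
earth_depth = transit_depth(R_EARTH_KM, R_SUN_KM)

print(f"Jupiter-Sun depth: {jupiter_depth * 100:.2f} %")
print(f"Earth-Sun depth:   {earth_depth * 1e6:.0f} ppm")
print(f"Rp/R* for a 1 % dip: {radius_ratio(0.01):.2f}")
```

A Jupiter-sized planet around a Sun-like star gives a dip of roughly 1 %, while an Earth-sized one gives under 100 ppm, which is why the noise level decides what is detectable.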
Try it: transit light curve
Slide the parameters. Toggle the V-shape to see why a grazing eclipsing binary masquerades as a planet — the classifier's job is to catch exactly this.
<div style="font-family:system-ui;color:var(--color-text);width:100%">
<canvas id="lc" style="width:100%;height:200px;display:block"></canvas>
<div style="display:flex;flex-wrap:wrap;gap:14px;margin-top:10px;font-size:13px">
<label>Depth <input id="depth" type="range" min="2" max="60" value="20"> <span id="depthV"></span></label>
<label>Duration <input id="dur" type="range" min="2" max="22" value="8"> <span id="durV"></span></label>
<label>Period <input id="per" type="range" min="15" max="70" value="34"> <span id="perV"></span></label>
<label>Noise <input id="noise" type="range" min="0" max="30" value="8"> <span id="noiseV"></span></label>
<label><input id="vshape" type="checkbox"> Eclipsing binary (V-shape)</label>
</div>
<div id="caption" style="margin-top:6px;color:var(--color-text-secondary);font-size:12px"></div>
</div>
<script>
const cv=document.getElementById('lc'), ctx=cv.getContext('2d');
const ids=['depth','dur','per','noise','vshape'], el={};
ids.forEach(i=>el[i]=document.getElementById(i));
['depth','dur','per','noise'].forEach(k=>{const v=store.get(k,null); if(v!==null) el[k].value=v;});
el.vshape.checked = store.get('vshape', false);
const cs=getComputedStyle(document.body);
const accent=cs.getPropertyValue('--color-accent')||'#4af';
const border=cs.getPropertyValue('--color-border')||'#444';
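// Approximate standard normal: the sum of four uniforms has variance 1/3, so scale by sqrt(3) for unit variance.
function gauss(){return (Math.random()+Math.random()+Math.random()+Math.random()-2)*Math.sqrt(3);}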
function draw(){
const dpr=window.devicePixelRatio||1;
const W=cv.width=Math.max(1,cv.clientWidth)*dpr, H=cv.height=200*dpr, pad=12*dpr;
ctx.clearRect(0,0,W,H);
const depth=el.depth.value/1000, dur=+el.dur.value, per=+el.per.value, noise=el.noise.value/4000, vsh=el.vshape.checked;
const N=620, ys=[];
for(let i=0;i<N;i++){
let f=1.0;
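const phase=((i%per)+per)%per, d=Math.min(phase, per-phase); // distance in samples from the nearest mid-transit
const half=dur/2; // d is measured on both sides of mid-transit, so compare against half the duration
if(d<half){ const x=d/half; f -= depth*(vsh ? (1-x) : (1-Math.pow(x,6))); }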
f += gauss()*noise; ys.push(f);
}
const fmin=Math.min(...ys)-0.0008, fmax=Math.max(...ys)+0.0008;
ctx.globalAlpha=.45; ctx.strokeStyle=border;
ctx.beginPath(); ctx.moveTo(pad,H-pad); ctx.lineTo(W-pad,H-pad); ctx.stroke(); ctx.globalAlpha=1;
ctx.strokeStyle=accent; ctx.lineWidth=1.4*dpr; ctx.beginPath();
ys.forEach((f,i)=>{ const px=pad+(W-2*pad)*i/(N-1), py=pad+(H-2*pad)*(1-(f-fmin)/(fmax-fmin)); i?ctx.lineTo(px,py):ctx.moveTo(px,py); });
ctx.stroke();
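// The readout spans are not in `el`, so look them up by id before writing to them.
const show=(id,t)=>{document.getElementById(id).textContent=t;};
show('depthV',(depth*100).toFixed(1)+'%'); show('durV',dur); show('perV',per); show('noiseV',(noise*100).toFixed(2)+'%');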
document.getElementById('caption').textContent = vsh
? 'V-shaped grazing dip → typically an eclipsing binary (false positive), NOT a planet.'
: 'U-shaped flat-bottomed periodic dip → planet transit. Depth ∝ (Rp/R*)².';
['depth','dur','per','noise'].forEach(k=>store.set(k,el[k].value)); store.set('vshape', el.vshape.checked);
}
ids.forEach(i=>el[i].addEventListener('input',draw));
new ResizeObserver(draw).observe(cv); draw();
</script>
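The U-versus-V distinction in the widget can be reduced to a single number. The sketch below reuses the widget's toy dip profiles ($1 - x^6$ for the flat-bottomed case, $1 - x$ for the V) and computes a "fill fraction": the mean in-dip drop divided by the deepest drop. The metric name and the 0.7 threshold are my own assumptions for illustration, not something the mentor specified, and a real pipeline would compute it on a folded, binned curve, where noise makes the peak estimate less clean.

```python
def dip_profile(x: float, v_shaped: bool) -> float:
    """Normalised drop (0..1) at fractional distance x from mid-transit.

    Same toy shapes as the widget: linear for V, 1 - x**6 for U.
    """
    return (1.0 - x) if v_shaped else (1.0 - x ** 6)


def fill_fraction(drops: list[float]) -> float:
    """Mean in-dip drop divided by the deepest drop (1.0 = perfect box)."""
    peak = max(drops)
    if peak <= 0:
        raise ValueError("no dip present")
    return sum(drops) / (len(drops) * peak)


def looks_flat_bottomed(drops: list[float], threshold: float = 0.7) -> bool:
    """Hypothetical cut: treat high fill fractions as U-shaped."""
    return fill_fraction(drops) >= threshold


n = 1000
xs = [(i + 0.5) / n for i in range(n)]  # midpoints across the half-dip
u_fill = fill_fraction([dip_profile(x, v_shaped=False) for x in xs])
v_fill = fill_fraction([dip_profile(x, v_shaped=True) for x in xs])

print(f"U-shape fill fraction: {u_fill:.3f}")
print(f"V-shape fill fraction: {v_fill:.3f}")
```

For these noise-free profiles the U-shape comes out near 6/7 and the V-shape near 1/2, so a cut anywhere in between separates them. On real data the gap narrows, which is one reason the shape feature should feed the classifier rather than replace it.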
Methodology the mentor laid out (5 steps)
- Detrending — remove ramp-like systematic trends from the detector/spacecraft that fake signals; flatten the light curve.
- Periodicity search + phase-folding — guess a period, fold the data so repeated dips stack into one clean signal (BLS / TLS territory).
- Shape characterisation — fit the folded dip; U-shape (flat bottom) ⇒ planet, V-shape (grazing) ⇒ eclipsing binary / false positive.
- AI classifier — given a light curve, classify: transit present? and if so which kind — planet / massive planet / star-on-star / detector artefact.
- Parameter recovery + validation — run the trained classifier on a held-out set, recover depth/period/duration, check against expected significance levels.
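The first two steps above can be sketched end to end on a synthetic curve. This is a from-scratch toy under stated assumptions, not the BLS or TLS implementations: evenly spaced samples, a box-shaped transit, a linear ramp as the only systematic, and periods searched on an integer grid measured in samples. The function names, window length, and search grids are all hypothetical choices.

```python
import math
import random
import statistics


def running_median_detrend(flux, window=31):
    """Divide by a running median; the window must be much longer than a transit."""
    half = window // 2
    out = []
    for i in range(len(flux)):
        lo, hi = max(0, i - half), min(len(flux), i + half + 1)
        out.append(flux[i] / statistics.median(flux[lo:hi]))
    return out


def box_search(flux, periods, widths):
    """Brute-force box search; returns (snr, period, start, width, depth).

    All quantities are in samples. For each trial period the curve is
    phase-folded into integer phase bins, then every box position is scored.
    """
    n = len(flux)
    total = sum(flux)
    sigma = statistics.pstdev(flux)
    best = (0.0, None, None, None, 0.0)
    for period in periods:
        bin_sum = [0.0] * period
        bin_cnt = [0] * period
        for i, f in enumerate(flux):  # phase-fold
            bin_sum[i % period] += f
            bin_cnt[i % period] += 1
        for width in widths:
            for start in range(period):
                idx = [(start + k) % period for k in range(width)]
                s_in = sum(bin_sum[j] for j in idx)
                n_in = sum(bin_cnt[j] for j in idx)
                if n_in == 0 or n_in == n:
                    continue
                # Depth = mean out-of-box flux minus mean in-box flux.
                depth = (total - s_in) / (n - n_in) - s_in / n_in
                snr = depth * math.sqrt(n_in) / sigma
                if snr > best[0]:
                    best = (snr, period, start, width, depth)
    return best


# Synthetic light curve: ramp + box transits + Gaussian noise (seeded).
rng = random.Random(7)
N, TRUE_PERIOD, TRUE_WIDTH, TRUE_DEPTH = 600, 40, 4, 0.01
raw = []
for i in range(N):
    f = 1.0 + 0.02 * i / N                   # slow instrumental ramp
    if (i - 10) % TRUE_PERIOD < TRUE_WIDTH:  # transit starts at sample 10
        f -= TRUE_DEPTH
    raw.append(f + rng.gauss(0.0, 0.002))

flat = running_median_detrend(raw)
snr, period, start, width, depth = box_search(flat, range(20, 61), range(2, 7))
print(f"period={period} start={start} width={width} depth={depth:.4f} snr={snr:.1f}")
```

On this toy curve the injected 40-sample period stands well clear of its aliases, because folding at the wrong period smears the dips across phase. The recovered depth comes out slightly shallow since the running median is itself pulled down a little by the in-transit points, a bias worth tracking in the parameter-recovery step.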
Data
- Public: TESS raw light curves — MAST archive. One sector of high-cadence data ≈ 20–30k light curves.
- Provided: a curated labelled set (known exoplanets, false positives, eclipsing binaries) for training. (Mentor confirmed the curated set on TESS in the session.)
Evaluation
- Robustness of the AI classifier (detection + classification accuracy).
- Accuracy of recovered transit parameters vs expected significance.
- Method/approach, visualisation, clarity. 3-page methodology report.
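The first two criteria can be tracked with a few small helpers. The sketch below is my own assumption about what is useful to report, since the exact scoring formula was not given: per-class precision, recall and F1 (more informative than raw accuracy when planets are rare), plus relative error and a z-score for each recovered parameter. The labels and numbers are made-up toy values.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def relative_error(recovered: float, truth: float) -> float:
    """Absolute error as a fraction of the true value."""
    return abs(recovered - truth) / abs(truth)


def z_score(recovered: float, truth: float, sigma: float) -> float:
    """Error in units of the quoted 1-sigma uncertainty."""
    return (recovered - truth) / sigma


# Toy labels (hypothetical): one eclipsing binary is mistaken for a planet,
# and one planet is mistaken for an eclipsing binary.
y_true = ["planet", "eb", "planet", "artefact", "eb", "planet"]
y_pred = ["planet", "planet", "planet", "artefact", "eb", "eb"]
p, r, f1 = precision_recall_f1(y_true, y_pred, positive="planet")
print(f"planet: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")

# Toy parameter recovery (hypothetical values, period in days).
print(f"period relative error: {relative_error(3.52, 3.50):.4f}")
print(f"period z-score: {z_score(3.52, 3.50, sigma=0.01):.1f}")
```

A z-score far outside ±3 means the quoted uncertainty is too optimistic, which matters as much for the report as the point estimate.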
Why this fits me (and where I must differentiate)
- Direct analog: my Mallorn Kaggle work — rare-event classification from noisy astronomical light curves with class imbalance + hard-negative mining. ~70–80% of the hard parts transfer.
- New stages to add: the periodicity/transit search front-end (BLS/TLS) and transit parameter fitting (batman/transitleastsquares). Mallorn handed me pre-extracted transients; here I detect them.
- Hardware: GPU-optional. XGBoost-on-features needs no GPU; streamable data (download → features → discard), ~30–60 GB/sector, 16 GB RAM fine. Safe even with no GPU.
- Crowd risk: popular, low barrier ⇒ crowded with generalist ML teams. My edge is real but shared — I win only if I lean on the period-search + parameter-fit + uncertainty stages most will skip.
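The download → features → discard pattern from the hardware point can be sketched as a generator that holds one light curve at a time. The fetch function here is a hypothetical stand-in for a real MAST download, and the three features are placeholders, not the feature set I would actually ship.

```python
import statistics
from typing import Callable, Iterable, Iterator


def stream_features(
    target_ids: Iterable[str],
    fetch: Callable[[str], list[float]],
) -> Iterator[tuple[str, dict[str, float]]]:
    """Yield (target_id, features) one target at a time.

    Only a single light curve is alive at any moment, so memory use
    does not grow with the number of targets.
    """
    for tid in target_ids:
        flux = fetch(tid)  # download (or read) a single light curve
        med = statistics.median(flux)
        yield tid, {
            "median": med,
            "scatter": statistics.pstdev(flux),
            "min_drop": med - min(flux),  # deepest excursion below the median
        }
        # `flux` is rebound on the next iteration, so the old curve is freed.


def fake_fetch(tid: str) -> list[float]:
    """Hypothetical stand-in for a real download; returns a tiny fixed curve."""
    return [1.0, 1.0, 0.99, 1.0, 1.0]


rows = list(stream_features(["A", "B"], fake_fetch))
for tid, feats in rows:
    print(tid, feats)
```

Only the small feature rows need to be kept, which is what makes a full sector workable on a 16 GB machine.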
Tools
astropy, lightkurve, transitleastsquares, batman, XGBoost/CatBoost, plus my Mallorn classifier scaffolding.