Created at 9am, Mar 19
Ms-RAGScience
Logistic regression to boost exoplanet detection performances
cuc3EoesEvx-BxWL9J8zauOxQm1y46uXz4-H8XezVOw
File Type: PDF
Entry Count: 77
Embed. Model: jina_embeddings_v2_base_en
Index Type: hnsw

Hadrien Cambazard1, Nicolas Catusse1, Antoine Chomez2,3, Anne-Marie Lagrange2,3, Pierre Vieu2
1- Univ. Grenoble Alpes, CNRS, Grenoble INP, G-SCOP, F-38000 Grenoble, France
2- Laboratoire d’Études Spatiales et d’Instrumentation en Astrophysique, Observatoire de Paris, Univ. PSL, Sorbonne Univ., Univ. Paris Diderot, France
3- Univ. Grenoble Alpes, Institut de Planétologie et d’Astrophysique de Grenoble, France
19 March 2024
ABSTRACT
Direct imaging of exoplanets requires separating the background noise from the exoplanet signals. Statistical methods have recently been proposed that avoid subtracting any signal of interest, unlike the initial self-subtracting methods based on Angular Differential Imaging (ADI). However, unless conservative thresholds are chosen to claim a detection, such approaches tend to produce a list of candidates that includes many false positives. Choosing high, conservative thresholds means missing the faintest planets. We extend a statistical framework with a logistic regression to filter the list of candidates. Features with physical/optical meaning (in two wavelengths) are used, leading to a very fast and pragmatic approach. The overall method requires a simple edge detection (image processing) and clustering algorithm to work with sub-images. To estimate its efficiency, we apply our approach to targets observed with the ESO/SPHERE high contrast imager that were previously used as tests for blind surveys. Experimental results with injected signals show that either the number of false detections is considerably reduced or faint exoplanets that would otherwise not be detected can sometimes be found. Typically, on the blind tests performed, we are now able to detect around 50% more of the injected planets with an SNR below 5, with a very low number of additional candidates.
Keywords: Exoplanets. Techniques: high angular resolution – techniques: image processing – methods: data analysis
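The filtering step described in the abstract can be illustrated with a small logistic regression. This is only a minimal sketch under stated assumptions, not the authors' implementation: the two features per candidate (a mean SNR and a gradient value), the toy training values, the learning rate and the 0.5 decision threshold are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient descent on standardised features."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) for j in range(d)]
    Z = [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Prediction error (probability minus label) for every training stamp.
        errors = [sigmoid(sum(wj * zj for wj, zj in zip(w, z)) + b) - yi
                  for z, yi in zip(Z, y)]
        w = [wj - lr * sum(e * z[j] for e, z in zip(errors, Z)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
    return {"w": w, "b": b, "means": means, "stds": stds}

def planet_probability(model, features):
    """Probability that a candidate stamp belongs to the positive class."""
    z = [(f - m) / s for f, m, s in zip(features, model["means"], model["stds"])]
    return sigmoid(sum(wj * zj for wj, zj in zip(model["w"], z)) + model["b"])

# Hypothetical training stamps: [mean SNR, gradient feature]; 1 = injected planet, 0 = noise.
X = [[6.0, 1.2], [5.5, 1.0], [7.0, 1.5], [2.5, 0.2], [3.0, 0.3], [2.0, 0.1]]
y = [1, 1, 1, 0, 0, 0]
model = fit_logistic(X, y)

# Filter a hypothetical candidate list with an assumed 0.5 probability threshold.
candidates = [[6.5, 1.3], [2.2, 0.15]]
kept = [c for c in candidates if planet_probability(model, c) >= 0.5]
```

Raising the threshold above 0.5 would trade missed faint planets for fewer false positives, which is the same trade-off the abstract describes for detection thresholds.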

BT4 #TP 243 HIP12394-4M 214 HIP12394-3M 214 192 HIP1993 HIP107345 #TN 15076 15540 15540 15791 #TP 34 34 34 34 #U 2 2863 8384 7707 2421
Table 4. Number of positive (planet injection) and negative stamps (extracted from a pure noise map generated with algorithm 1). The total number as well as the number of negative stamps of SNR H2 greater than 2 are reported.
were chosen so that their contrasts lie along the 5-sigma contrast curves, to test the capability of the classifier. We chose to use the classifier on HIP12394 with 3Mjup and 4Mjup injected planets because both planets cross the contrast curve between 0 and 1 arcsec.
on the entire field with an additional 0/1 indicator feature ( 9 in the Appendix) defining whether or not a stamp is in the star halo. In other words, this indicator feature tells when MeanSpec is relevant and can potentially help the classifier.
3.3 Features analysis
id: 78e0a30d1f74c8e9b9b2c4b48416219b - page: 6
We analyse the distribution of six of the features (MeanSnr, MeanGra, MaxGra, MaxMin, AiryFig, MeanSpec) across true positive and true negative stamps. We recall that a feature is simply a real number computed on a stamp, and that a good feature is correlated with a class (positive/negative). Figure 4 provides, for each feature, two box plots showing the distribution of its values for true positive (box labeled pos) and true negative (box labeled neg) stamps. A box plot summarises the distribution in 5 numbers, from bottom to top: minimum, first quartile, median, third quartile and maximum. Typically, the values of the feature for 50% of the stamps lie in the box. The median value is the orange horizontal line within the box. Values considered as outliers (more than 1.5 times the inter-quartile range below the first quartile or above the third quartile, limits which define the minimum and maximum shown) are not shown for the sake of clarity. The correlation coefficient (r value) is given for each feature. We expect useful features to show distinct distributions
id: 132c31c4276f1e10378a41ad7020c4dc - page: 6
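The five-number summary behind each box plot can be sketched as follows. This assumes the common plotting convention in which the whiskers stop at the most extreme values within 1.5 times the inter-quartile range of the box; the sample values are hypothetical and not taken from the paper.

```python
import statistics

def box_plot_summary(values):
    """Return the five numbers drawn by a box plot, plus the hidden outliers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme values that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "min": inside[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": inside[-1],
        "outliers": outliers,
    }

# Hypothetical feature values for ten stamps; 40.0 plays the role of an outlier.
summary = box_plot_summary([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 40.0])
```

Comparing the `q1` to `q3` ranges of the positive and negative classes gives a quick numerical counterpart to reading the overlap of the two boxes by eye.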
The size of the intersection of the two distributions (for the positive and negative class) gives an idea of the discrimination power of the feature.
id: a53674dbe838b8ec0e1be76b7cac3b5e - page: 6
Four features appear decisive (correlation coefficient 0.6): the mean SNR intensity (MeanSnr), the gradient features (MeanGra, MaxGra) as well as the feature related to the presence of an Airy Figure (AiryFig). The feature related to speckles is not strongly correlated with the presence of a companion (r = 0.29). This is expected, as speckles are only present within the star halo. Restricting the analysis to the halo region, between 30 and 140 pixels (i.e. 370 to 1700 mas), significantly increases the correlation to r = 0.61 (see Figure 5). It might therefore be appropriate to build two distinct classifiers, one for the star halo that includes MeanSpec and one for the remaining area without it. But this would further complicate the overall process. We decided to keep it simple for the moment and used the feature MeanSpec
id: 8cda2715bc225069d669cfd16b19bdc2 - page: 6
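The effect of restricting the analysis to the halo can be illustrated with a short computation. This is a sketch only: it assumes the r value is a Pearson correlation between the feature and the 0/1 class label, and the stamps below are synthetic values invented for illustration (a speckle feature that is informative inside the halo and identically zero outside it). Only the 30 to 140 pixel bounds come from the text.

```python
import math

HALO_MIN_PX, HALO_MAX_PX = 30, 140  # halo bounds quoted in the text

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def in_halo(separation_px):
    """0/1 indicator: is the stamp inside the star halo?"""
    return HALO_MIN_PX <= separation_px <= HALO_MAX_PX

# Synthetic stamps: (separation in pixels, hypothetical MeanSpec value, class label).
stamps = [
    (40, 0.9, 1), (60, 0.8, 1), (80, 0.2, 0), (120, 0.1, 0),    # inside the halo
    (200, 0.0, 1), (250, 0.0, 0), (300, 0.0, 1), (350, 0.0, 0),  # outside the halo
]

r_all = pearson_r([s[1] for s in stamps], [s[2] for s in stamps])
halo = [s for s in stamps if in_halo(s[0])]
r_halo = pearson_r([s[1] for s in halo], [s[2] for s in halo])
```

On this synthetic set the correlation is diluted over the full field and recovers once stamps outside the halo are excluded, which is the qualitative behaviour reported for MeanSpec.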
How to Retrieve?
# Search

curl -X POST "https://search.dria.co/hnsw/search" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"rerank": true, "top_n": 10, "contract_id": "cuc3EoesEvx-BxWL9J8zauOxQm1y46uXz4-H8XezVOw", "query": "What is alexanDRIA library?"}'
        
# Query

curl -X POST "https://search.dria.co/hnsw/query" \
-H "x-api-key: <YOUR_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"vector": [0.123, 0.5236], "top_n": 10, "contract_id": "cuc3EoesEvx-BxWL9J8zauOxQm1y46uXz4-H8XezVOw", "level": 2}'
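The search call above can also be prepared from Python with the standard library. This is a sketch that mirrors only the URL, headers and JSON fields shown in the curl command; the helper name `build_search_request` and the example query are hypothetical, and the response format is not documented here, so the request is built but not sent.

```python
import json
import urllib.request

SEARCH_URL = "https://search.dria.co/hnsw/search"
CONTRACT_ID = "cuc3EoesEvx-BxWL9J8zauOxQm1y46uXz4-H8XezVOw"

def build_search_request(query, api_key, top_n=10, rerank=True):
    """Build (but do not send) a POST request equivalent to the curl search call."""
    payload = {
        "rerank": rerank,
        "top_n": top_n,
        "contract_id": CONTRACT_ID,
        "query": query,
    }
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

request = build_search_request(
    "Which features separate planets from noise?", "<YOUR_API_KEY>"
)
# To send it, pass the request to urllib.request.urlopen(request) with a real API key.
```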