Image Analysis Competitions for Cosmological Lensing


The HelpDesk 

If you need assistance with any aspect of GREAT10 please email the helpdesk at

great10.helpdesk@gmail.com

We encourage all participants to read the Challenge Handbook thoroughly.

The Downloads

Q. Why is there so much data in the GREAT10 Galaxy Challenge?

A. This is necessary to achieve the accuracy required and also to assess sensitivity to a few different observational conditions.


Q. How long does it take to download everything?

A. Most of the data volume is in the GREAT10 Galaxy Challenge, which is approximately 900 GB in total:

  1. At a download speed of 10 MB/s this will take ~26 hours, or about 1 day

  2. At 5 MB/s this will take ~51 hours, or about 2 days

  3. At 1 MB/s this will take ~260 hours, or about 11 days

The exact speed experienced will depend on your location. Please see the download page for a list of mirrors. Choose the central node in preference, or otherwise the mirror that is geographically closest to you.
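The estimates above are simple arithmetic and can be repeated for any connection speed. The following is a minimal sketch, assuming a constant transfer rate and taking 1 GB as 1024 MB (an assumption, chosen because it gives roughly 26, 51 and 256 hours for the three speeds quoted); the function name is illustrative only.

```python
# Rough download-time estimate for the Galaxy Challenge data.
# Assumption: 1 GB is taken as 1024 MB and the speed is constant.
TOTAL_GB = 900
MB_PER_GB = 1024


def download_hours(total_gb: float, speed_mb_per_s: float) -> float:
    """Return the transfer time in hours at a constant speed in MB/s."""
    seconds = total_gb * MB_PER_GB / speed_mb_per_s
    return seconds / 3600.0


if __name__ == "__main__":
    for speed in (10, 5, 1):
        hours = download_hours(TOTAL_GB, speed)
        print(f"{speed:>2} MB/s: {hours:6.1f} hours ({hours / 24:4.1f} days)")
```

Real transfers will usually be slower than this because the rate is rarely constant.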


Q. Do I need to download everything to take part in the Galaxy Challenge?

A. To take part you need to download all of the Galaxy Challenge galaxy images (450 GB). If you want to use the functional form of the PSF rather than the star images, you do not need to download the star images.


Q. Do I need to download both the tarballs and the fits files?

A. No, just one or the other.


Q. Where did the torrents go?

A. We haven't provided torrents for v1.0. Let us know if you think it would be useful.


Q. Would you post me a hard disk with the data on?

A. There are a small number of hard disks in circulation. Please get in contact via the helpdesk if you want a disk posted to you. By agreeing to receive the disk you are agreeing to pay and organise (in a timely way) to post it on to someone else (within the same continent). Thank you.


The Images 

Q. Is there any training data for the star challenge? 

A. Not explicitly, but the galaxy challenge stars where the PSFs are provided can be used to train for the star challenge.


Q. How is the Airy profile in sets 21 and 22 of the Galaxy challenge exactly defined? 

A. There is a new description on the website at

http://great.roe.ac.uk/great10challenge/Galaxy_Information.html#widget2 

There is also some SuperMongo code, provided by Prof. Kuijken and used in Kuijken (2006), at

http://great.roe.ac.uk/data/code/sm/Airy.sm


Q. How is the Star Quality factor defined exactly? 

A. There is now a description here http://great.roe.ac.uk/great10challenge/Star_Information.html


Q. The PSF for the stars in the Galaxy Challenge centred at x=4776 has an ellipticity which does not fit a smooth functional form. Can I ignore this data?

A. Yes. This was a slight error in the PSF creation for some images (in the Galaxy Challenge, not the Star Challenge) that was only identified after the challenge launch. Please ignore the x=4776 stars if fitting a smooth functional form to the PSF in the Galaxy Challenge. We will not use the x=4776 galaxies to calculate the Quality factor.


Q. What is the pixel scale for the star challenge submission?

A. The pixel scale of the submitted fits cubes should be the same as the pixel scale in the images. If you wish to submit a higher resolution model, please contact the GREAT10 helpdesk to make special arrangements.


Q. Do images 33 and 68 in Galaxy challenge set 5 have a glitch? 

A. Images 33 and 68 in set_5/ had some glitches that have been fixed. Please re-download them from

http://great.roe.ac.uk/data/g10_v1/objs/set_5/g10_v1_t5_1_objs_33.fits

http://great.roe.ac.uk/data/g10_v1/objs/set_5/g10_v1_t5_1_objs_68.fits


Q. The fits cubes for the star challenge upload are large using gzip. 

A. We now allow fits compression using fpack (http://heasarc.gsfc.nasa.gov/fitsio/fpack/). This is a lossy compression algorithm, but it has a negligible effect on gravitational lensing quantities. It reduces the tar'ed file size to ~4000 MB in total, which at an upload speed of 1 MB/s would take ~1 hour to upload (but you should get a faster upload than that).


Q. I find that the coordinate system is flipped x->y with respect to what I am using. 

A. We find, from multiple participants, that people use a number of conventions which are related to the GREAT10 convention by:

(i) x <--> y

(ii) FWHM --> FWHM / (1 + e1^2 + e2^2)

We allow the submission of any of these conventions for both the star and galaxy challenges. The quality factor code will try each convention in turn (i, ii, i+ii) and present the maximum quality factor.
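The idea of trying each convention and keeping the best score can be sketched as follows. This is not the GREAT10 quality factor code. The record keys (x, y, fwhm, e1, e2), the function names and the scoring callable are all hypothetical, the swap in (i) is assumed to apply to the position columns only, and the as-submitted catalogue is assumed to be scored alongside the three transformed ones.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]  # hypothetical keys: x, y, fwhm, e1, e2


def swap_xy(rec: Record) -> Record:
    """Convention (i): exchange x and y (assumed here to mean positions only)."""
    out = dict(rec)
    out["x"], out["y"] = rec["y"], rec["x"]
    return out


def rescale_fwhm(rec: Record) -> Record:
    """Convention (ii): FWHM -> FWHM / (1 + e1^2 + e2^2)."""
    out = dict(rec)
    out["fwhm"] = rec["fwhm"] / (1.0 + rec["e1"] ** 2 + rec["e2"] ** 2)
    return out


def best_quality(
    catalogue: List[Record],
    quality: Callable[[List[Record]], float],
) -> Tuple[str, float]:
    """Score every convention with the supplied function and keep the maximum."""
    candidates = {
        "as submitted": catalogue,
        "i": [swap_xy(r) for r in catalogue],
        "ii": [rescale_fwhm(r) for r in catalogue],
        "i+ii": [rescale_fwhm(swap_xy(r)) for r in catalogue],
    }
    scores = {name: quality(cat) for name, cat in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Any callable that maps a catalogue to a number can be passed as `quality`; the real quality factor is defined in the challenge documentation, not here.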





Q. Does each different star image correspond to a different galaxy image?

A. Yes. Every galaxy image has a corresponding star image. These should be identifiable from the file names.


Q. Why so many sets of images?

A. These sets represent repeated observations of a patch of sky, as in real observations. The shear variation for each set is the same over all the images. This helps to test the averaging processes and also helps to reduce the impact of noise on the final quality factors.


Q. What is the difference between the GREAT10 Training data and the full set?

A. The only difference is the type of simulated observing condition. It should be possible to get a large quality factor with the training data alone.


Q. Is every set the same? What changes between the sets?

A. Not every set is the same. In each set we have varied a particular observing condition. This is representative of real data, where we can identify different observing conditions and galaxy properties (for example high or low signal-to-noise) before the shear is estimated.


Q. Why are you using fits files?

A. This is the standard format for astronomical images. It is basically a binary file of the image values, and is therefore relatively compact (it also contains header information in a standard format, which is usually used to store information about when the images were taken etc). There is a fair bit of support for fits files.


Q. How can I view a fits file?

A. Either read it into your usual program, or you can use standard software such as ds9.


Q. How do I read a fits file into my program?

A. Please see http://fits.gsfc.nasa.gov/ for more information. For example, fits files can be read into MATLAB using image=fitsread('filename.fits'). There are also some example codes for reading in the images.
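For readers who want to see what is inside the file, the following is a dependency-free Python sketch of reading a simple 2D image. It relies only on the basic FITS layout (2880-byte blocks, 80-character header cards, big-endian data). It assumes the image is stored in the primary HDU as 32-bit floats (BITPIX = -32) with no BSCALE/BZERO scaling, which may not hold for the GREAT10 files, and the function name is illustrative. In practice an established library such as astropy.io.fits would normally be the better choice.

```python
import struct

BLOCK = 2880  # FITS files are organised in 2880-byte blocks
CARD = 80     # each header record ("card") is 80 ASCII characters


def read_primary_image(path):
    """Return (header, rows) for a simple 2D float32 primary image.

    header maps keywords to their raw string values; rows is a list of
    image rows, bottom row first. Anything other than a 2D image with
    BITPIX = -32 raises ValueError (this is a sketch, not a full reader).
    """
    header = {}
    with open(path, "rb") as f:
        done = False
        while not done:
            block = f.read(BLOCK)
            if len(block) < BLOCK:
                raise ValueError("truncated FITS header")
            for i in range(0, BLOCK, CARD):
                card = block[i:i + CARD].decode("ascii")
                key = card[:8].strip()
                if key == "END":
                    done = True
                    break
                if card[8:10] == "= ":
                    # Keep the value field and drop any trailing "/ comment"
                    header[key] = card[10:].split("/")[0].strip()
        if int(header["BITPIX"]) != -32 or int(header["NAXIS"]) != 2:
            raise ValueError("sketch only handles 2D float32 images")
        nx, ny = int(header["NAXIS1"]), int(header["NAXIS2"])
        raw = f.read(4 * nx * ny)
    values = struct.unpack(f">{nx * ny}f", raw)
    # NAXIS1 (x) varies fastest, so each run of nx values is one image row
    rows = [list(values[j * nx:(j + 1) * nx]) for j in range(ny)]
    return header, rows
```

A call such as `header, rows = read_primary_image("filename.fits")` then gives the pixel at column x, row y as `rows[y][x]`, counting from zero.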


Q. Where can I find out more about fits files?

A. FITS (Flexible Image Transport System) format is defined in Astronomy and Astrophysics, volume 376, page 359; DOI 10.1051/0004-6361:20010923.


Q. Why are the star images at random positions on the image?

A. In principle this provides high resolution information on the PSF, which is required for many methods.


Q. Do I need to use the star images and also the analytic expressions?

A. It is not necessary to use both. We suggest you use one or the other. We provide both to try to make it more convenient for you.


Q. How do you pixelise the images?

A. A square grid is made, and all the light falling in a square grid element is summed to give the values you receive in the fits files. This reflects the realistic case in an astronomical detector. Therefore this is equivalent to a convolution with a top hat square, followed by a sampling. In principle this convolution could be considered a part of the PSF.
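The equivalence described above can be illustrated with a small sketch. It assumes the underlying light distribution is available on a finer grid whose size is a whole multiple of the pixel size; the function names and the fine-grid representation are illustrative and are not taken from the GREAT10 simulation code.

```python
def box_sum(fine, j, i, factor):
    """Sum the factor x factor block of `fine` whose lower corner is (j, i)."""
    return sum(fine[j + dj][i + di]
               for dj in range(factor) for di in range(factor))


def pixelise(fine, factor):
    """Sum all the light falling in each detector pixel."""
    ny, nx = len(fine), len(fine[0])
    if ny % factor or nx % factor:
        raise ValueError("fine grid must be a whole multiple of the pixel size")
    return [[box_sum(fine, j, i, factor) for i in range(0, nx, factor)]
            for j in range(0, ny, factor)]


def tophat_then_sample(fine, factor):
    """Same result, written as a square top-hat convolution, then sampling."""
    ny, nx = len(fine), len(fine[0])
    # Convolution with a square top hat: a box sum at every fine-grid offset
    smoothed = [[box_sum(fine, j, i, factor) for i in range(nx - factor + 1)]
                for j in range(ny - factor + 1)]
    # Sampling: keep one smoothed value per detector pixel
    return [row[::factor] for row in smoothed[::factor]]
```

Both functions return the same pixel values, and the total flux of the fine grid is preserved.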


Q. What is the read noise and gain of the images?

A. You can consider them to have a read noise of 0 and a gain of 1. Gaussian, homoscedastic noise has been added to all images, such that the objects have a particular signal-to-noise. There is no non-stationary noise (dark or light areas having more noise).
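As an illustration of what homoscedastic Gaussian noise means here, the sketch below adds noise with the same standard deviation to every pixel, regardless of how bright the pixel is. The function name, the value of sigma and the use of Python's standard random module are assumptions for illustration and do not reflect how the GREAT10 images were actually generated.

```python
import random


def add_gaussian_noise(image, sigma, seed=None):
    """Return a copy of `image` with zero-mean Gaussian noise added.

    The same sigma is used for every pixel (homoscedastic, stationary
    noise), so bright and dark regions receive identical noise levels.
    """
    rng = random.Random(seed)
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in image]
```

Passing a fixed `seed` makes the noise realisation reproducible, which is convenient when testing a method.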


Q. Why are the maximum values of the pixels low?

A. The images have been scaled to a particular signal-to-noise per image. This is defined as the integrated object intensity divided by the variance of the Gaussian noise added to the images.


Q. Can I have more Training data to develop my method on?

A. We will say no in the first instance, unless a convincing case can be made. The GREAT10 training set should be sufficient to get a high quality factor.


Q. I am still developing my method and it fails randomly on some galaxies. How bad is it if I just ignore some of the galaxies in an image?

A. You should attempt to provide a shear estimation for every galaxy, if you choose to submit a shear catalogue. Removing galaxies effectively introduces a mask and the method used to generate the power spectrum from the shear catalogue assumes no masks. We cannot guarantee that randomly removed galaxies will not impact your quality factor, and most likely will degrade your results. Alternatively you can submit power spectrum results if you can remove the effect of masked galaxies using alternative power spectrum methods.


Q. I am still developing my method and it fails (not necessarily randomly) on some galaxies. How bad is it if I just ignore some of the galaxies in an image?

A. You should attempt to provide a shear estimation for every galaxy, if you choose to submit a shear catalogue. Removing galaxies effectively introduces a mask and the method used to generate the power spectrum from the shear catalogue assumes no masks. We cannot guarantee that randomly removed galaxies will not impact your quality factor, and most likely will degrade your results. Alternatively you can submit power spectrum results if you can remove the effect of masked galaxies using alternative power spectrum methods.


Q. I am still developing my method and it fails randomly on some stars. How bad is it if I just ignore some of the stars in an image?

A. This will mean that you are not using all information available in the PSF. The PSF varies across the images, hence if you ignore some of this variation it will most likely bias your results and degrade your quality factor.  


Q. When I look at the images I can see that sometimes a large ellipse-shaped galaxy gets cut off at the edge of the postage stamp. Does this mean that there is a bug in the images?

A. No. We have divided the images up into independent postage stamps to make the challenge easier. We have kept the postage stamps small to keep the total data volume down. We could have made the galaxies smaller and less elliptical to make the challenge even easier. However we made a decision to include the handling of missing data (the parts of the galaxy outside the postage stamp) as a part of the challenge. (In real images there are also missing pieces of information e.g. due to detector defects or cosmic rays.)


Q. What is the coordinate system used for the star challenge centroids?

A. All coordinates are Cartesian, in units of pixels, and we define the origin (0, 0) as the bottom left corner of the bottom left pixel. This means that the centre of this bottom left pixel is at (0.5, 0.5). However, we note that this convention is not always used. SExtractor and DS9 (commonly used software tools in astronomy) define the bottom left corner as (0.5, 0.5) by default, so that the centre of the bottom left pixel is at (1, 1).
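The two conventions described above differ by a constant shift of half a pixel in each axis. A minimal sketch of the conversion follows; the function names are illustrative only.

```python
OFFSET = 0.5  # half-pixel shift between the two conventions


def ds9_to_great10(x, y):
    """Convert SExtractor/DS9-style coordinates to the GREAT10 convention."""
    return x - OFFSET, y - OFFSET


def great10_to_ds9(x, y):
    """Convert GREAT10 coordinates to the SExtractor/DS9-style convention."""
    return x + OFFSET, y + OFFSET


def pixel_centre_great10(col, row):
    """GREAT10 coordinates of the centre of the pixel with zero-based
    array indices (col, row), where (0, 0) is the bottom left pixel."""
    return col + 0.5, row + 0.5
```

For example, a centroid reported by DS9 at (1, 1) is at (0.5, 0.5) in the GREAT10 convention.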


Q. What is the difference between a fits image and a fits cube?

A. A fits image is a 2D image containing objects at positions within that image. A fits cube is a 3D image in which the first two dimensions are the x and y coordinates of a series of images, which are stored as a series of postage stamps along the 3rd dimension. See here for more details.
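The layout of a cube can be sketched without any fits library. In a fits data array the first axis (x) varies fastest, then y, then the stamp index, so the stamps are simply stored one after another. The function names below are illustrative, and stamps are assumed to be equally sized lists of rows.

```python
def cube_index(x, y, k, nx, ny):
    """Zero-based offset of pixel (x, y) of stamp k in the flattened cube,
    with x varying fastest, then y, then the stamp index k."""
    return x + nx * (y + ny * k)


def stack_stamps(stamps):
    """Flatten equally sized 2D stamps into cube order.

    Returns (flat_values, (nx, ny, n_stamps)).
    """
    ny, nx = len(stamps[0]), len(stamps[0][0])
    flat = []
    for stamp in stamps:
        if len(stamp) != ny or any(len(row) != nx for row in stamp):
            raise ValueError("all postage stamps must have the same shape")
        for row in stamp:
            flat.extend(row)
    return flat, (nx, ny, len(stamps))
```

With this ordering, `flat[cube_index(x, y, k, nx, ny)]` recovers pixel (x, y) of the k-th postage stamp.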


The Rules (see Appendix of the GREAT10 Handbook)

Q. Can I train on GREAT10 Training Data?

A. Yes, you are allowed to do this.


Q. When is the GREAT10 Challenge deadline?

A. 1700 BST on the night of 2nd September 2011.


Q. What is the prize?

A. There will be two prizes, one for the winner of the main galaxy challenge and one for the most promising new method, as judged by the GREAT10 Team.


Each prize will be an expenses paid trip to the final workshop at JPL Pasadena for one member of the winning teams, plus an iPad or iPod -- or an item of similar price (winners can negotiate the exact prize but not the total amount of expenditure).


Travel expenses will be paid from anywhere in the world, but must be a standard class return ticket and reasonable accommodation and subsistence expenses. The winner of the galaxy challenge is the entry with the highest Q value on the leaderboard at the competition deadline. All information provided on the method at the competition deadline will be used in assessing the most promising new method. Members of the GREAT10 Coordination Team (Kitching, Amara, Bridle, Heymans, Gill, Massey, Shmakova, Voigt) are not eligible for prizes. The goal of the GREAT10 Challenge is to find promising new methods that will be used on upcoming cosmic shear data to find out the nature of the dark energy.


Q. Where will the workshop(s) be?

A. There are 3 workshops:


  1. 26th to 28th January 2011 Edinburgh, eScience Institute, UK

  2. 3rd to 5th May 2011 London, UCL, UK

  3. 26th to 29th September 2011, JPL, Pasadena, USA (Final Workshop)


Feasibility

Q. Would it be sensible for me to recommend this problem to a PhD student who started October 2010?

A. We believe we have phrased the shear estimation problem in a concise way in the GREAT08 Challenge Handbook, that we hope will be accessible to a starting PhD student.


Q. Will GREATXX Challenges be around in 5 years' time?

A. We hope that the next challenge will be launched in 2012. The ultimate lensing experiments should start in ~2020, by which point we hope that the algorithmic challenges will be solved.


Q. How do you expect it to be possible to get around the fact that we don't know what galaxies look like (e.g. the top left hand panel of Fig. 2 in the GREAT08 Challenge Handbook)?

A. There are two potential ways that we can think of to overcome limitations in our knowledge about galaxy shapes:

(i) build a method for estimating shear that is relatively insensitive to the galaxy shape

(ii) use deep images (i.e. GREAT10 training data) as a prior for the majority of the images


Q. Do I need to understand all the equations in the Handbook?

A. The challenges are to measure the blurring effect in astronomical images, and to measure the shapes of galaxies (how elliptical they appear). The equations in the GREAT10 Handbook are meant as a guide only, and detailed knowledge of them is not required to tackle the basic problem.

 

This page contains further information in the form of frequently asked questions. It will be continually updated.


FAQs
