OSIRIS data reduction:
by Conor Laver

Latest additions:
Stellar Extraction 07/19/2006
Pipeline on Intel Mac with Intel IDL 6.3 07/19/2006

On this page I'll detail the steps I've taken to reduce my OSIRIS datacubes from raw cubes to science quality data. The hope is that I'll save some poor soul a few hours of work here and there with these tips. I also will include a link to the page where you can get most of the *.pro files I mention here, and the general tools I've developed which are useful for data reduction.  You can either download files individually here or as a tarball (right click and save) laverlib.tar.gz . Note for my routines to work you may need some of the support routines listed here. Also throughout the instructions I'll mention things you should think of before you take your data, perhaps even before you apply for time at Keck.

Raw OSIRIS data cubes are in a folder which should have the structure /**yourdate**/SPEC/raw/ . They have the naming convention of s-year-month-day_a-dataset number-frame number.fits (e.g. s060417_a003001.fits was taken in 2006-April-17th , this file is dataset number 003 and frame number 001). OSIRIS data is taken in 'datasets', which are a group of spectral 'frames' taken immediately after one another. You do this to either immediately take a sky frame, or multiple frames over an object to mosaic later.
    This a good spot to mention what type of data you will need to successfully reduce your science data. After discussing each data type I will explain how they all tie in together to reduce your data.

USEFUL LINKS
CONTENTS:

Infrared Data Reduction Prep

       DETECTOR SATURATION: For all the following and for your data by careful not to saturate the detector or even to go in to the non linear regime. For OSIRIS this means keeping total number of counts on the spectrograph to lower than 10,000. Remember all frames you look at are showing count rates PER SECOND, and there is no warning system should you go to high. So the counts to check equals counts on the screen*integration time per coadd .
       DIFFERENTIAL ATMOSPHERIC CORRECTION (DAR): This is the name for the fact that light traveling through our atmosphere diffracts a little differentially depending on wavelength. What it means to you is that the center of your star will appear slightly shifted at one end of your datacube compared to the other. This only really noticeable at the 0.02" platescale, (I think), I've seen shifts of order 1-2 pixels at this platescale. This means that the spectrum from one pixel isn't in fact from just that pixel, it has in fact large contributions at different wavelengths from neighboring pixels. Not a problem if you are just adding up the pixels, but as this is a imaging spectrometer, you may want to get individual spectra from each pixel, so you must be aware of this. My page showing more detailed effects of this will appear here.
        SKY FRAMES: First the very minimum correction you will want is a sky frame, which is a frame taken with exactly the same settings as your data frame. The purpose is to subtract these frames in order to subtract off a good estimate of the sky emission lines. These lines are very variable on timescales of >5 minutes. So try to take a sky frame with every target image. This is one of the main considerations in keeping a 5 minute upper limit on your integration times. Of course if the region you are interested in does not contain many sky lines, this is not as big a problem.  The convention in OSIRIS is to take datasets which consist of two frames, the first is your target and the second is sky. Keeping some consistency with which frame number your sky frame is, will back reducing large amounts of data much simpler.
       TELLURIC STARS: Of course if you are just looking at blank sky for the sky emission lines, you won't get any idea of what the telluric absorption is like. Or how to correct for the instrumental response. For this you need a telluric standard star, which you will compare to a model(see below section on model stars) to find what correction to make to your target spectrum.  Most everyone uses either G2 star(solar analogs) or A stars (vega analogs). The reasoning is that G stars have well studied spectra, since we have studied our sun very well. The G star spectra has fewer large features but many more weak absorption features than the hotter A stars. A stars have the advantage that their continua are smooth and somewhat approximated by a blackbody curve (however in the IR this approximation is poor as we will see). Their large features can be fit with Lorentzians and removed (using my removeline.pro !), or more roughly just interpolated over(this is commonly done but has the side effect of not correcting any of the region covered by the line, which can be substantial in some bands). Ideally you will take your telluric star spectrum as close in possible in time and airmass as your target. In practice this is hard to do, unlike nodding off target to take a sky, it can take a lot of time to set up on another target, and so you cut your losses and take telluric stars within an hour or two of your target data. Do try to match airmass however, or take a range of airmass telluric standards and do some fancy interpolation of the telluric absorption. NOTE: For telluric stars signal to noise is very important. You want to have a higher (preferably much higher) signal to noise in your telluric star than in your target so that when you use it to correct your target spectrum you don't add too much noise. So pick bright stars (JHK mags ~6-7) and use short integration times (see note above on detector saturation) and multiple coadds. Remember signal to noise will increase by the square root of time so don't be shy about taking many coadds, if your star is bright each coadd should only be 2-10seconds, so go nuts, coadded spectra are just as easy to reduce as single spectra.
       PHOTOMETRIC STARS: If you are looking for emission lines in your target spectrum or you are interested in your continuum flux you will want to take some photometric standard stars. In the IR there are a few lists, Elias(bright), Persson(dim), and 2MASS (plentiful) to name a few. I have been overjoyed with the amount of data in the 2MASS database, chances are you've used stars that are covered by the database without even realizing it, so look them up using vizier ( http://vizier.u-strasbg.fr/viz-bin/VizieR?-source=II/246 ) and you may be pleasantly surprised. They quote very small errors on their magnitudes, and I haven't yet had the opportunity to test them. However I suspect you will be limited by the photometric conditions of the atmosphere long before you are limited by their errors. In many cases you will able to use the same star for both telluric and photometric correction, especially if you use 2MASS. So the biggest problem with photometry for me is that I observe with the OSIRIS broadband filters and the smallest platescale(0.02" pixels) so my field of view is very small. Thus I am going to be affected by the same problems as most conventional spectrometers which is light loss off the edges of my detecting area. However given that you have a good estimate of the PSF of the star from your datacube,  you can estimate roughly what percentage of my light you're losing and correct your integrated flux. NOTE: For the above reason it helps a lot to take a bit of time to center your Photometric standard star well. Also once again high signal to noise is desirable, however the integrated flux should be as susceptible to atmospheric changes (unless its really not a photometric night in which case why are you even bothering to do photometry!!!)


SUMMARY:
Type            Integration Time       Coadds      Timing      Airmass
Sky              ************Same as target*****************
Telluric        Short                         Many         **Near target**
Photometry  Short                         Many         Any         Near target

How to put it all together or 'The things you do to the data love' are:
1. Subtract Sky (part of OSIRIS pipeline)
2. Create a telluric correction spectra by dividing telluric spectrum by an appropriate stellar model (see below) and then mean normalize it (so you don't affect count numbers in your target spectrum)
3. Divide the result of 1. by the result of 2. to give you a continuum slope corrected spectra free from telluric emission and absorption lines
4. Apply the same correction to a photometric star. Estimate the flux lost of edges of your detector.
5. Compare the results of 4. to a photometrically calibrated model stellar spectrum(see below) to estimate a counts to flux ratio, correct it for the flux loss estimated in 4.
6. Apply this ratio to your corrected target spectrum to finish with a corrected photometrically calibrated spectrum ready for science.
7. Write a paper and thank UCLA for building an awesome instrument.


Ok so that's the theory and a few tips on what to expect/prepare for. Now on to how actually to do all of the above, using the all  powerful IDL language, and the great OSIRIS reduction pipeline designed by those folks down in UCLA and over in Köln, Germany.
 

PIPELINE NOTES:

    The most current version of the pipeline is installed onsite at Keck in Waimea and the majority of the toughest reduction is actually done for you by the pipeline. In fact the online reduced data is almost science quality, with the only real difference in most cases being the rectification isn't quite as accurate. It takes care of flat fielding (see note below), calibration, rectification, sky subtraction, cosmic ray removal and mosaicking(if set up properly). However as with all reduction pipelines, it has some limitations which I'll discuss now. But quickly I'll say that despite these problems it is still by far your best bet for all these steps, just be aware of the problems.

QUICK DEFINITIONS:

DRF: This is an XML file which specifies the reduction to be done by the pipeline, the DRFs used by your online reduction are available in the folder /**yourdate**/drf/ . It is also copied into the reduced datafile's header.

DDF: These are the observing settings you used, such as integration time, nod size, filters etc avalible in /**yourdate**/ddf/

QUICKLOOK: This is just one of many programs you can us to look at spectral datacubes. It was written for OSIRIS.

REDUCTION ISSUES:

RECTIFICATION: Converting a 2D data file into a 3D spectral data cube is about as hard as it seems and then some. The smart guys in UCLA have pretty much solved this problem however with only a few hiccups remaining. The idea is they have created rectification matrices which effectively map the 2D array to a 3D cube for you, the way it does it is well beyond the scope of this page, suffice it to say it is not trivial. The current rectification matrices (as of July 2006) for the Jbb and Zbb filters have small errors in them which result in a corruption of some pixels of the cube with fairly devastating effects on spectra. Until this is fixed feel free to try my jfix.pro, zfix.pro and zfix2.pro . These identify the problem  pixels which get corrupted after a normal rectification ( 50 iterations ) and fix them. I identified them by hand because not all the bad pixels are detectable using an automated routine(or at least not one smart enough that I could write fast).
 
FLAT FIELDING: So you are probably feeling nervous, that you haven't taken any flats (as you were told not to), however the official line is that the rectification matrix was generated with a flat light source and so thus actually applies a pretty good first order flat fielding for you. However it becomes quickly apparent that this isn't quite perfect and finding a good way to flat field is the current top priority project. Suffice to say that the current spectral flat fielding should be ok, and the spatial will be worked on.

COSMIC RAY CLEANING: Like any detector OSIRIS gets hit by cosmic rays, and like any cosmic ray cleaner working on low spatially sampled data, you have to be careful it doesn't think the peak of your star is a cosmic ray hit. You know it has done this if your spectrum looks horribly noisy and seems to be jumping between two intensity levels from pixel to pixel. Thanks to James Larkin, though the current cosmic ray cleaner seems to do a very good job of avoiding this by operating on the data before it is rectified.

CALIBRATION: Any scientist worth there salt won't trust the wavelength calibration done by anyone but themselves. However once again the OSIRIS team have done a great job, and with what little poking I've done so far, the wavelength calibration seems to be good to a quarter pixel or so. Of course if you want to check yourself or fine tune it, you can use a sky frame and get a line list of atmospheric emission lines and fit a few. For now I'm satisfied, in the future I may write up some OSIRIS specific code for this.

MOSAICKING: The mosaicking is done by creating a separate DRF, see examples here. It currently uses the telescope position from the headers. All the shifts are relative to the starting position and thus are equal to the shifts you entered on the mosaicking DDR. It isn't perfect, and probably needs to be improved for science quality data. The telescope positions can be a pixel or two off and for planetary bodies such as Io and Titan this results in a non perfect disk. (Note the image of Io, the discontinuity near the right limb. The bright spot is a recent eruption of Tvashtar, which is the subject of a paper in work)


STELLAR EXTRACTION: (NEW! 07/19/2006) So having investigated the effects of various methods of calculating the spectra of stars and volcanoes from OSIRIS 0.02" broadband cubes, here are my current thoughts. Differential atmospheric refraction(DAR) and regular refraction are both key players here. Two things you must realise before continuing. First, if you star is off center and you are at an airmass were you can observe a shift in central peak of your star due to DAR then naturally you could lose differentially off one edge of the detector. This I think has a relatively small effect unless you are quite close to the edge (~3-4 pixels). Second however the PSF of your star broadens with wavelength so you will lose redder light of all the edges preferentially over blue, giving a false blue slope to your continuum, if you just do simple aperture photometry. The image below shows a rough fit for a K~6.9 mag A star.
PSF broadeningPSF broad
The ultimate result is that while aperture photmetry will result in the highest signal to noise spectra, which is perfectly acceptable for investigating narrow lines, more care is needed if any continuum analysis is required. In the figure above you can see the Effect of PSF Broadening on the slope the spectra, the simple aperture photometry spectra is in black, whereas the gaussian fit to the core of the star, shows a redder slope. That this is not a DAR effect was shown by the fact that all my stars showed loss of red light despite being off center in different directions. If it were due to DAR these offsets should show reddening in one spectra and 'blueing' in the other.
To fully investigate this effect, I put together a Moffat-Gauss profile fitting routine, which should do the best job of recreating the PSF of Keck and it has the added bonus of giving me some estimate for the percentage of flux lost off the edges, to see if this flux loss is a reasonable cause for the slope change. A Moffat-Gauss profile is as it sounds a combination of the two and is required as the stellar core of the PSF is best fit with a gaussian, however the broad wings require a Moffat profile. At some point I will write a short discussion about the various profiles used, their applicability, limitations and the routines I implemented them with.
The end result was pretty good. The first problem was that due to limited field of view (64 x 19 pixels) fitting the Moffat wings in the X (19pixel) direction was shakey at best, in fact doing this supressed the X wings and led to poor flux estimates. However if we assume radial symmetry of the profile, which is fair, we can apply the fit to the Y wings(64 pixel direction) in all directions, which gives a spectral slope very close that seen above for the gaussian core fit. Due to the increased number of parameters of the fit, the computational time is greater by an order of magnitude(~10-15 minutes per cube in my Intel iMac), and the fit parameters must be carefully limited to prevent the fit mistaking local chi squared minima. As a result the signal to noise in the MoffGauss profile is slightly worse than either of the above fits, however the slope and general flux level should be much more representative of the intrinsic spectra. The latter of which can be used to adjust flux calibrations later. The model itself is useful as a diagnostic tool, for various other extraction routines and aperture simulations. Ultimately unless a better MoffGauss fit can be obtained I'm going to go with a two stream appraoch.


Below is a discussion of the simpler aperture photometry used if you are continuum normalizing your spectra and just want the highest signal to noise.

There is a stellar extraction module, however for aperture photometry I've been using a modified version of Marshall Perrin's extractstar.pro . Which essentially performs a simple extraction of the star given a user defined box size. Eventually someone may create an optimal extraction routine for stellar extraction, however given my buckets of signal, it is not my top priority. What I did do is play determine the optimum box size you should integrate over. For all my stars using broadband filters and 0.02" platescale, I found a 20 x 20 box was best. I determined this by using a couple of little tools I wrote, one is nplot.pro and its related noise.pro . These estimate the signal to noise in a spectra (nplot is interactive, noise is for use in other code).
Plot of singla to noise vs box size
As you can see the signal to noise plateaus at around a half box width of 10 pixels, giving a full integration box of 20 x 20.  Quick learners will note that this is actually wider than the area of the image for a broadband 0.02" cube, so in reality the integration width is 16 x 20. Care must be taken if you are using this for photometry to accurately correct the counts for the flux lost of the edges.

HOME REDUCTION: Currently KECK and the OSIRIS team does not support installing the OSIRIS pipeline on your own computer. Neither do I, however I can tell you are one of those people who won't take no for an answer so here are a few tips. If you have managed to sweet-talk your way into getting a version of the current pipeline, it has been successfully installed on Linux, Mac OSX(powerPC) and Mac OSX (Intel). If you are running IDL 6.3 (PowerPC) or lower then  you'll need to force a compilation of the c dynamic libraries under a power pc architecture, due to the fact that IDL currently runs through rosetta. If you have IDL 6.3 for Intel Mac then great, I've just installed (July 2006) the pipeline on that and it does work, and very fast too. You will need to recompile all the c routines, including cfitso, and create new dynamic libraries. NOTE: if any of the above does not make sense then DO NOT ATTEMPT IT, reduce your data at Keck like everyone else. I'm not a black-belt computer support guy, so if you have problems I probably can't answer them.



APPLYING YOUR KNOWLEDGE!!


Ok so if you haven't run away screaming yet and all the above has gone well then you should have some processed data cubes and be ready to start cleaning up your target data. Before proceeding you need to have processed your data with the basic pipeline which includes the following OSIRIS modules:
  • Subtract Dark Frame
  • Remove Cross Talk
  • Adjust Channel Levels
  • Glitch Identification
  • Clean Cosmic Rays
  • Spatially Rectify Spectrum
  • Assemble Data Cube
  • Save DataSet Information

You will need the following reduced ingredients to follow my recipe:
  • Target Datacubes
  • Telluric Standard Cubes
  • Photometric Stand Cubes (If you are doing photometry)
You will also need some or all of the following information:

The plan is to develop the best model you can of your telluric standard to work out the correction required to cancel out the atmosphere and instrumental spectral response. To apply this correction to all the datacubes and then estimate from the photometry star the counts to flux ratio. Sounds easy? Here we go:

STELLAR MODELS:
Ok so to work out what the atmosphere and the telescope have done to your spectra you need to know what it looked first. In fact hopefully you've studied this before you even get to the telescope to know if your science interests are gonna be screwed by atmospheric absorption or intrinsic telluric lines. If not check out this image I made which shows the atmospheric lines and a G2(Solar) spectrum convolved to what OSIRIS sees.
So in general many astronomers at this point assume their star is a blackbody at the stars effective temperature and maybe remove/interpolate over a few of the strongest features in the telluric star's intrinsic spectrum. Depending on your science goals this may be enough, great stop now and go do your science. Before you skip off like a gamboling lamb, know that if you are interested in continuum slopes or weak absorption lines this is not enough! Stars are not blackbodies in the infrared, and they certainly might not even be in the optical if there is interstellar reddening. Also if you are using G stars then they have tons of weak absorptions which will add noise to your target spectrum.
So this is the method I use, because I am looking for both small continuum slope errors and weak absorptions!
1. I take a detailed stellar atmospheres model continuum from the web pages of Professor Kurucz at Harvard, from 0.5 to 5 microns
2. I convolve a very high resolution spectra of a G star (our sun) taken from the ISAAC webpage, and convolve it down to the resolution of OSIRIS, using IDL's convol routine and a gaussian of approximate width of two resolution elements. ( I actually eventually allow this width to be a free parameter to get the best fit to my data)
3. I multiply these together to get an ideal unreddened star of my chosen spectral type.
4. I use the JHK magnitudes from 2MASS and the methods of Cohen 2003 to get a first estimate for the flux calibrated spectrum
5. I use this estimate in conjunction with the reddening model of Fitzpatrick 1999 (IDL routine fm_unred.pro) and the JHK magnitudes to fit for the best reddening parameter to match the 2MASS magnitudes (this is all done by my routine fitredden.pro )

The results in a fully flux calibrated, OSIRIS spectral resolution spectra whose JHK magnitudes match 2MASS's to within their error estimates. For now this the best model I can think of, and it works well. The reddening is essential to have a spectra which is consistent at JHK wavelengths. The final spectral parameter, the radial velocity of the actual star, I fit by simply minimizing the residuals after dividing my real spectra by this model (I wrote a little signal to noise calculator called noise.pro and an interactive wrapper nplot.pro) . Thus while more involved that a simple BB fit, we can be confident of our spectral continuum slopes and even get estimates of the color excess E(B-V), thus the extinction, Av and also the radial velocity of the star.

This model is ideal for use for both telluric correction and photometric calibration. 

USING SPECTRAL MODELS:
Once you have this you are ready to do some spectral math:
M = Flux calibrated model
N= Normalized Model (mean=1)
T= Extracted Telluric Standard Star Spectrum
P=Extracted Photometric Standard Star Spectrum (flux corrected)
SPEC= Extracted Science Target Spectrum

Telluric Correction (telcor) = T/N
Corrected Phot star (corphot) = P/telcor
Counts to Flux ratio (c2fratio) = mean( M/corphot)
Science ready flux calibrated target spectrum = (SPEC/telcor)*c2fratio


and you are done!!!!!!!
Well except now you have to do the scientific analysis of your data and that my friend is your business.............

Final note: If, like me, you might use the same star for both telluric and photometry, you can skip to
Fluxed telluric Correction (fluxtelcor)= T/M
Science ready flux calibrated target spectrum = SPEC/fluxtelcor
however the counts to flux ratio is an interesting number which you may want to compare over multiple observations or multiple stars......

Also I find it useful to use a little routine called spec2cube.pro which I wrote to make a fanned out cube of the telluric correction so I can use quicklook to divide it into any cube and see a real time telluric correction for each pixel of any target data....



Last Updated by Conor Laver on 2006 July 7th