RADIOSONDE DATA SPECIFICATION DOCUMENT -------------------------------------- Henrik Vedel, DMI The Danish Meteorological Institute, DMI, is providing radiosonde data for the project. This file gives a description of the format of the data files, the way in which the specific humidity is deduced from the radiosonde data, and of how to obtain the data. Region covered: --------------- The region from which the radiosonde data are extracted is defined by, -30.0 <= longitude <= 40.0 degrees, 25.0 <= latitude <= 89.9 degrees. "Meteorological" Radiosonde data ------------------------------ The radiosonde data which we obtain at the DMI from the GTS network, via which meteorological data are distributed, contain geopotential height, temperature, dewpoint temperature, wind speed and wind direction at selected pressure levels. The levels are a combination of standard pressure levels and extra levels at which the atmospheric profile was found to change behavior during each particular radiosonde ascent. Of these data we extract pressure, geopotential height, temperature, and dewpoint temperature for the project. Further the specific humidity is calculated and included in the extracted data files, as it is the specific humidity which enters the calculation of ztd. About heights ------------- For the purpose of comparing radiosonde data with GPS zenith tropospheric delay, it is necessary to corretct for the offset in height between the GPS site and the radiosonde site and to carry out a vertical integration. If doing so, note that the heights provided in the radiosonde records are geopotential heights, which differs from geometry heights. The reference surface of reference for the heights is the geoid. The document http://www.dmi.dk/f+u/publikation/vidrap/Sr00-04.pdf provides further details on the height issue. About calculation of relative humidity and specific humidity based on radiosonde report data. -------------------------------- For the purpose of calculating a value of the zenith tropospheric delay (ZTD) from the radiosonde data, the specific humidity (q) is required. However, in the WMO GTS Network radiosonde data, the only information provided for deriving this is the dewpoint temperature. The radiosondes DO NOT measure dewpoint temperature (T_dew), but rather relative humidity (RH). The ground station equipment converts the measured relative humidity to dewpoint temperatures before transmitting the radiosonde report via the GTS network. There can be a systematic bias associated with this conversion if converting back to RH using another formula. The bias between using a physically 'precise' formula and the formula actually used by the typical radiosonde ground equibment we have found to be of the order 2.3 mm of ZTD. To circumvent these problems we carry out the following steps: 1) Convert back from T_dew to relative humidity RH by using the inverse of the formula used by the most common radiosonde ground equipment (we do not know what is used at each site). 2) Convert from RH to specific humidity q, using also pressure and temperature from the radiosonde report, and a precise formula for the water vapor saturation pressure. Below is a more detailed account of the equations used for the radiosonde dataset and some background information on the problem. Some formulas and definitions -------------------------------- The definition of relative humidity is H = e/esat(T), (1) where e is the actual water vapor pressure and esat(T) is the saturation water vapor pressure at temperature T. The radiosonde measures H and T. The calculation of esat given the temperature is described below. The dewpoint temperature is defined as the temperature at which the water vapor saturation pressure equals the actual water vapor pressure, in other words esat(T_dew) = e = H*esat(T). (2) The specific humidity, which is the quantity entering the nwp models as well as the ztd calculation is defined as, q = density_water_vapor/total_density_air = eps*e/(p-e*(1-eps)), (3) where eps is the ratio of molecular weight of water to that of dry air, and p is the total pressure (measured by the radiosonde is this case). Deducing q it is tempting to take e = esat(T_dew) using the most precise formulae for the function esat(T) available, but unfortunately we don't know which expression for esat(T) was used in the original conversion from the measured (H,T) -> T_dew to get the reported dewpoint temperature, so that approach might create a bias. The Esat function ----------------- Esat itself can be found by solving the so-called Classius-Clapeyron equation, which is complicated and involves various assumptions. Various analytical approximations to esat exist, of different complexity and precision. The one used internally in the Hirlam model is rather precise (compared against lab. data and more elaborate analytical expressions), having the form, esat_hirlam = R2*exp(R3*(T-T0)/(T-R4)), (4) R2= 610.78 Pa, T0=273.16 K, R3_ice=21.875 K, R3_liquid=17.269 K, R4_ice=7.66 K, R4_liquid=35.86 K, with the 'ice' constants being used below -15 Celsius and the 'liquid' constants above freezing, and a gradual change, linear in T, from the ice constants to the liquid constants over the -15 to 0 Celsius interval. This being an ad hoc way to mimic the existence of a three phase medium in that temperature interval. Conversion from relative humidity H to dewpoint temperature T_dew -------------------------------- For carrying out the conversion from H to T_dew, an approximate function can be derived for inverting the esat(T) relation. It is unclear for what historical reason the measured quantity, the relative humidity, is not broadcast along with the the dewpoint temperature. There is unfortunately no meteorological standard for the formula to use for the conversion of H to T_dew (Johm Elms, UK Met Office, personal com. via the WMO). Different radiosonde ground stations may use different formulas! During meteorological data assimilation the dewpoint temperatures are converted back to relative humidity or to specific humidity using each meteorological office's standard formula. This mismatch is not a big concern for weather models, as the errors introduced are small compared to other uncertainties, both it the models and the radiosonde measurement of relative humidity (which is good to about 1-2%). But it may create a systematic bias in comparisons with independent datasets where the measurement accuracy may be more precise, for example in comparisons with GPS ztd's, when data for longer periods are treated statistically. The Digicora approximate H => T_dew relation -------------------------------- The Digicora radiosonde ground equipment uses the following formula for the conversion between H and T_dew: T_dew= T*2*K/ {T*ln(100/H%)+2*K}, with (5) K= 15*ln(100/H%) -2.0*(T-273.15)+ 2711.5 Temperatures are in K, while H% underlines that in the above version of the Digicora formula H should be specified in %. This is NOT the most correct formula to use. Their point of view is that because of the error on the H measurement is of the order of 1-2%, (referring probably to absolute error) a high precision is not necessary for the conversion. This argument fails when it comes to statistical consideration of large portions of data. About half the stations are estimated to use the Digicora formula (John Elms, UK Met Office, personal com. via the WMO), simply because Digicora and similar equipment is the most widespread. Probably none uses a version involving both the liquid and solid phase of water as the internal hirlam routine. Bias introduced through the approximate Esat function -------------------------------- We can calculate the effect of using the more approximate Digicora relation compared to using the more precise HIRLAM esat relation in the following manner. - convert the T_dew back to H by inverting the Digicora relation, (5) - calculate e = H*esat(T) using the precise HIRLAM esat relation, (4) - calculate q(p,e) and integrate to get ZTD This ztd_digicora estimate has essentially removed the effect of the imprecise H=> T_dew conversion. For comparison, we can carry out - calculate e = esat(T_dew) using the precise HIRLAM esat relation - calculate q(p,e) and integrate to get ZTD This ztd_HIRLAM estimate has incorrectly assumed the digicora relation is equivalent to the more precise HIRLAM relation. Using the radiosonde profiles we have been extracting for the MAGIC project (period 199811-200005) I have determined the offset in ztd from using one rather than the other of the Digicora and Hirlam conversion relations. I find, = 2.28+/-1.02 mm N=107520 <_station> = 2.31+/-0.38 mm N=135 A small, but clearly significant bias. (To be clear: This just compares radiosonde profiles against themselves using ztd as a measure and using different formulas for the Td->H conversion. There are no hirlam model runs and no gps data involved at all.) Based on this comparison, we have decided that it is necessary to remove the effect of the imprecise initial H => T_dew conversion when providing specific humidities for the MAGIC radiosonde dataset. Undoing the initial imprecise H=> T_dew conversion -------------------------------- As stated above, each meteorological office has a method for converting the radiosonde dewpoint temperatures back to relative humidities. A third relation for esat, provided below, is the one used currently in the Hirvda assimilation package being developed for Hirlam (the conversion routine comes from Ecmwf I believe), esat_hirvda = R2*exp(R3*(T-T0)/(T-R4)), (6) R2=611.21 Pa, R3=17.502 K, R4=32.19 K, T0=273.16 K This equation provides a means for converting back to relative humidity that is equivalent to inverting the imprecise Digicora formula. The difference in ztd's based on the Digicora and Hirvda routines is very small, of the order 0.01mm, with a standard deviation of the order 0.1mm, demonstrating that the Hirvda routine successfully removes the bias introduced by the imprecise Digicora formula. Offsets in relative humidity between the Digicora and the Hirvda routine do exist, but they are much smaller than between Hirlam and Digicora and appear at temperatures at which the humidity is very low. The use of esat_hirvda over the Digicora relation is preferred however, as not all radiosonde stations use Digicora equipment, and as a functional form for esat(T) is sometimes useful. Implementation for the dataset -------------------------------- It has been decided, therefore, to use the following approach when deriving specific humidities from radiosonde reports. We first find the best approximation to the original relative humidity measured by the radiosonde using the Hirvda formula, (6). H = esat_hirvda(T_dew)/esat_hirvda(T), (7) with T_dew & T from the report. Then we use the most precise formula for saturation pressure from the HIRLAM internal system to transform from relative humidity to specific humidity. q = eps*H*esat_hirlam(T)/(p-H*esat_hirlam(T)*(1-eps)), (8) with p & T from the report, and H from the line above. This approach is supposed to give reasonable specific humidities in an average sense. If the conversion formula used at a particular radiosonde station is known, or if the measured relative humidities are available directly, then then this results may be improved a bit. Important: the original T_dew dewpoint temperatures from the GTS network are included in the TOUGH datafiles. Warning against bad data. ------------------------- The data we extract have passed a low level quality control only. Some will still contain erroneous measurements. Whenever the number "-9999.90" appears in the data files it means the observation in question is missing or noted to be wrong. Other data can be errorneous as well, but harder to identify. Format of the radiosonde data files containing daily data: ----------------------------------------------------------- On a daily basis the DMI extracts radiosonde data. Each station report is written as a separate file, named identifier.time, where identifier is the international 5 digit identifier of the station, and time is the time of the radiosonde launch in the format yyyymmhh. The files will be in ascii format, the content structured as: Line 1: #, ccode, name Line 2: ident, lat, lon, alt, nlev, yyyy, mm, dd, hh, mm Line 3: p(1),phi(1),T(1),T_dew(1),q(1) Line 4: p(2),phi(2),T(2),T_dew(2),q(2) Line 5: . Line 6: . Line 7: . Where # = start of a new record ccode = standard letter code for countries name = name of the station (e.g. name or nearby town or locality) ident = The international identifier used for identification of meteorological stations, a five digit integer number, may include leading zeroes. lat = latitude in degrees lon = longitude in degrees alt = altitude in meters nlev = number of levels at which there are reports. yyyy = year (4 digits) mm = month (2 digits) dd = day (2 digits) hh = hour (2 digits) mm = minutes (2 digits) p = pressure [Pascal] phi = geopotential [m**2/s**2] T = temperature [K] T_dew = dewpoint temperature [K] q = specific humidity A fortran routine writing the information listed above in the format we are using is provided as a separate document. Notice: Normally the pressure levels appear in order of decreasing pressure in the reports, but sometimes this is not the case throughout the full sequence. In some cases the whole or part of the pressure sequence might be repeated, often with one type of data reported missing in the first set and not in the second. We have not corrected for such things. The number nlev simply counts the number of pressure levels for which there were reports in the original radiosonde data file = the number for which we have extracted data. Secondly, one should know that the equipment in radiosondes sometimes gives wrong results, though not obviously so. Liquid water on the instruments has a degrading effect on the measurements. Something of particular importance to the estimates of the dewpoint temperature (and thereby also the specific humidity). Having said this, radiosondes in general provide high quality measurements, and they are independent of meteorological models and their shortcomings. Archiving: ---------- The data are stored time wise as well as station wise. The time wise stored data can be found as ./met/rs/recent/yyyymm/rsyyyymmdd.tar.gz at the ftp-server, with yyyymmdd=year+month+day_of_month in a 4+2+2 digit format. Data stored station wise can be found as ./met/rs/past/yyyymm/stationid.yyyymm.gz at the ftp-server, with the station-id being the 5 digit international identifier for each station (evt. with leading 0's). See, the ./met/rs/rs.list file for a conversion of this number to names and coordinates. The station archives are made using the Unix cat command with the input file having the format specified above. Updates of the 'recent' archives are made daily with a delay of a two days relative to current time. The purpose of the delay is to ensure that almost all radiosonde data for that day has arrived at DMI. In near real time some stations would be sometimes missing. Updates of the 'past' archives are made once a month Old data may get removed from the ftp server if disc space becomes a problem. In rs.list there will be one line for each radiosonde specifying: ident, ccode, name, lat, lon, alt where, ident = 5 digit meteorological identifier ccode = country code name = name of radiosonde station lat = latitude [degrees] lon = longitude [degrees] alt = altitude [m] Note, that as radiosondes are sometimes malfunctioning or stations taken out of service, there might well be radiosonde stations listed in the file rs.list for which there are no data available for a period. Acknowledgements. Jennifer Haase and Digicora are thanked for help leading to the current treatment of humidity in the radiosonde reports extracted by DMI. c********************************************************************* c subroutine write_radiosonde(outfile,ident,year,month,day, & hour,min, & contrycode,name,lat,lon,alt,p,T,T_dew,phi,spec_hum, & nlev) C C********************************************************************** implicit none integer ident,year,month,day,hour,min,nlev,j real p(nlev),T(nlev),T_dew(nlev),phi(nlev),spec_hum(nlev), & lat,lon,alt character*3 contrycode character*15 name character*250 outfile c------------------------------------------------------------------- open (15,file=outfile,status='unknown') write(15,'(a1,a4,a15)')'#',contrycode,name write(15,'(i5,2f8.2,f7.0,i3,i6,5i3)') ident,lat,lon,alt, & nlev,year,month,day,hour,min do j=1,nlev write(15,'(f10.2,f10.2,2f9.2,e13.5))') & p(j),phi(j),T(j),T_dew(j),spec_hum(j) enddo close(15) end -----------------