Data Format

Overall Structure of Data Files

GEIA Inventories include a 10 line header followed by the data. Documentation is provided in separate files. The file structures and formats are set up for ease of reading and viewing by a wide variety of users.

For hints on checking to see that you downloaded your data correctly, visit the data checking page.

Tools:

Download the fortran program readGeia.f, or view the program. It contains a program that will read the data into arrays. It can be used as a starting point to put the data into a format needed.

The data files are compressed using the pkzip utilities. If the user does not have the software to de-compress the files it can be downloaded as shareware from the pk-zip site.

GEIA Headers

There are 10 lines of header information.
line 1:
15 GEIA label
15 filename
10 file creation date
line 2:
10  specie
10 reference year for the data
10 resolution of the data (monthly, seasonal or annual)
20 units (metric units chosen to best capture the range of values for the chemical) 
2 number of levels
line 3: Documentation name and location plus contact persons' complete address
lines 4-10: Data range and additional clarifying information. Format for the data.

GEIA label:

This is a general identifier common to all GEIA data bases. The label identifies each file as a GEIA file and list the type of file. The labels are as follows: "GEIA Inventory" Data for one of the species "GEIA Document" Documention for one of the inventories "GEIA Auxiliary" Additional information (e.g., population files).

File names:

The file name includes the following: specie, reference year for the data, temporal period, number of levels, and the inventory version. See the description below. 

Example:  for filename: SO285sn1.1a = SO2, seasonal resolution, one level, version 1a Species: SO2, NOX, VOC, CO, ... Year: 85, 90, ... Temporal resolution of the data: by month, season, or annually denoted as mn, sn, yr Inventory Version: 1a, 1b,....., 2a, 2b,..... The corresponding header for this inventory would be:

12345678901234567890123456789012345678901234567890123456789 (character # not on header)
GEIA Inventory SO285sn1.1a    28 Jun 94 
SO2       1985      seasonal  tons/yr             1
Documentation in Hopkins et al. Documentation Name Here
Contact: D. Hopkins, Science & Policy, 2385 Panorama Ave 
Boulder Co 80304  
FAX: 303-442-6958  Phone: (303) 442-6866  E-mail: hopkins@rmii.com

Values: minimum: .6 maximum: 59.9528 sum: 4834.03 million tonnes C

Data Structure

Each line of data consists of an integer grid number and real specie values for the temporal resolution of the data, for each level with the format (I6,1x,12(E10.6,1x)). The E10.6 can be varied slightly to E10.x where x is the preferred decimal level for the inventory being presented. The specific metric units used are chosen to best accommodate the range of values. The format is chosen for ease of reading and viewing information by a wide variety of users. Given the wide variation of emissions magnitudes for some chemicals, the E format is adopted.

Given the variability in number of levels and temporal resolution for different inventories, the following guidelines are presented. If the specie data is provided with yearly resolution at one level only, then there is only one data point per line. If the species data is provided with seasonal or monthly resolution for one level only, then each line has 4 or 12 data points per line, depending on whether the information is available seasonally or monthly. The data in this case is listed with January first or winter first, again depending on whether the resolution is monthly or seasonal. If the species data is provided with annual resolution and multiple levels, then each data line has information for each of the levels going from 1 to the last level. If there are multiple levels and temporal resolution by month or season, then information for each level appears first with each data line containing values for each month or season. For example, for an inventory with two levels and seasonal resolution, then the data is presented for level one first with the information for each season appearing on each data line associated with each grid. These guidelines are different for the aircraft data. This information will be treated as a special case.

Each data file contains at most (360x180 data lines) with the grid number starting at lower left. For the inventories where there are more than one levels and the temporal resolution is seasonal or monthly, then each data file contains at most (360x180) * #levels data lines. Any grid with zeros for the emissions is not included in the data file.

Grid number is defined as follows:
Grid Number = (j*1000) + i
j   row number starting at 1 for 90S to 89S latitude, 
    to 180 for 89N to 90N latitude

i   column number starting at 1 for 180W to 179W longitude, 
    to 360 for 179E to 180E longitude. 

The coordinates represent the center of the grid.

The latitude and longitude of the center of the grid is given by:
    latitude = (j-91) + 0.5
    longitude = (i-181) + 0.5

Documentation on each inventory is prepared by the inventory developers. Notes on documentation can be found in the header information, and in "docs" directory when it is available.

The following is a short fortran program that will read in the data and put it into arrays.
       program readgeia

c 
c      ... program to read the geia data
c NOTE! this program will not compile until you change the format statement
c       #104!! read the data header to see what the appropriate
c       value is! if not noted on header use E10.6
c      ... this program is intended as a starting point from which
c      ...   the user modifies to do what they want
c      ...  GEIA CONTACTS:
c   
c   GEIA publications and general information can be obtained from:
c      GEIA Data Management and Information Exchange Center
c      Henry Lansford,  ACT Communication Manager
c      c/o Science & Policy Associates, Inc. 
c      2385 Panorama Ave. 
c      Boulder,Colorado 80304, USA   
c
c      Phone 303-442-6866, fax: 303-442-6958. 
c      Internet: henryl@rmii.com 
c    
c      To get on the mailing list please send email to the above address.
c
c   Questions about data access, or programming issues can be directed to:
c      Debra Hopkins
c      Phone 303-442-6866
c      Email: hopkins@rmii.com
c
c   Scientific Questions or Adminstrative Questions can be directed to:
c      Paulette Middleton
c      Phone 303-442-6866
c      Email: Paulette@rmii.com
c
c
c   For questions about individual data sets, please contact the
c      person listed in the header of that data. 
c
c      ... Modification History:
c            NAME    DATE      DESCRIPTION                         
c          D.Hopkins  7/94    wrote original 
c          D.Hopkins  1/95    update and min max 
c          D.Hopkins  1/95    made error on compile so user must put in format
c          D.Hopkins  3/95    correction to min/max if statemens 
c          D.Hopkins  3/16/95 corrected max calc. for equal data
c          D.Hopkins  3/16/95 correction to read prblms with 1 level,1time 
c
c       INPUT: 1.)"readGeia.in" which contains 2 lines
c               the first the name of the geia data set file
c               the second line with an output file name
c              2.) any geia data set named in the readGeia.in file
c       OUTPUT: summary data in a file named in the second line of
c               the readGeia.in file above
c               contains total, average, min and max
c
c      *************************************************************************
c      ... program control declarations
       integer maxLevels, maxTime, maxValues
       parameter (maxLevels=20, maxTime=12, nummax=5, 
     1            maxValues=
        maxTime*maxLevels) integer
       inFile, outFile data inFile, outFile
       /10, 12/ character*40
       datafilename, outfilename
       character*20 city logical done,
pointOut,
tilleof      c c ...
       header declarations character*10 createDate, specie,
       refYear, time character*15
       geiaLabel, filename
       character*20 units integer numLevels,igrid,
jgrid,
ilevel      c c ...
       values declarations real lat, long, average, 
     amnt(maxLevels,      maxTime),
     1      total(maxLevels,maxTime), 2
       max(maxLevels,maxTime,nummax), min(maxLevels,maxTime) 
     integer         maxi(maxLevels,maxTime,nummax), 
     1         maxj(maxLevels,maxTime,nummax), 2     
mini(maxLevels,maxTime),
minj(maxLevels,maxTime)      c
c      ************************************************************************* c ... open input file with input and output
       file namesopen(inFile,file='readGeia.in',status='old',form=
        'formatted') read (inFile,
       100) datafilename read (inFile,
       100) outfilename
       close (inFile) print
       *,' ' print *,'READING
       FROM: ',datafilename print *,'WRITING
TO:    ',outfilename 100
format
(A40)      c c ... open geia data file, and
       output fileopen(inFile,file=datafilename,status='old',form=
        'formatted')open(outFile,file=outfilename,status='new',form='formatted')
c
c      ... read header data
       read (inFile, 101) geiaLabel, filename, createDate
       read (inFile, 102) specie, refYear, time, units, numLevels
       read (inFile, 103) 
101    format (A15,A15,A10)
102    format (A10,A10,A10,A20,I2)
103    format (///////)
c      ... set number of times monthly:12 seasonal:4 annual:1
       if (time .EQ. "monthly") numTimes = 12
       if (time .EQ. "seasonal") numTimes = 4
       if (time .EQ. "annual") numTimes = 1
c
c      ... and print out some values to assure us that things are roughly correct!
       write(outfile,*) ' '
       write(outfile,*),
     1   '------------------------------------------------------------'
       write(outfile,*),'READING: ',geiaLabel,' FILE: ',filename
       write(outfile,*),'SPECIE: ', specie, ' YEAR: ',refYear, 
     1                  ' UNITS: ',units
       write(outfile,*),'   THIS IS ',time,' DATA     with ',numTimes, 
     1                 ' data times'
       write(outfile,*),'   FILE HAS ',numLevels,' LEVEL(S)'
       write(outfile,*),
     1   '------------------------------------------------------------'
       write(outfile,*),' '
       write(outfile,*) 'DATA AT SELECTED POINTS'
       write(outfile,*) '  LA, PARIS, HONG KONG, ATLANTIC & PACIFIC'
       write(outfile,*) '  if no value at points then nothing printed'
c
c      ... initialize min max arrays
       do itime=1, numTimes
          do ilevel = 1, numLevels
             total(ilevel,itime) = 0.
             min(ilevel,itime) = 0.
             do imax=1, nummax 
                max(ilevel,itime,nummax) = 0.
             end do
          end do
       end do
c
c      ... read in data
       tilleof = .false.
       do while (.NOT. tilleof)
c
c         ... read in all levels and times for one grid
c         ... DH 3/16/95 added parenth. as had prblms with 1time,1levl
          read (inFile, 104, end= 1000) (jgrid, 
     igrid,       1((amnt(ilevel,itime),itime=1,numTimes),ilevel=1,numLevels)) 
c         ... compute lat and long for this grid
          lat = (float(jgrid)-91.) + 0.5
          long = (float(igrid)-181.) + 0.5
c
c         ... output data at selected points
c         ... (note: lat lon at center of grid xx.5
c         ...        N latitudes positive, S latitudes negative
c         ...        E longitudes positive, W longitudes negative
          pointOut = .TRUE.
          if ( (lat .eq. 34.5) .and. (long .eq. -118.5) ) then 
              city = 'LOS ANGELES'
          else if ( (lat .eq. 21.5) .and. (long .eq. 115.5) ) then
              city = 'HONG KONG'
          else if ( (lat .eq. 48.5) .and. (long .eq. 2.5) ) then
              city = 'PARIS'
          else if ( ( lat .eq. 30.5) .and. (long .eq. -45.5) ) then
              city = 'ATLANTIC OCEAN'
          else if ( (lat .eq. 45.5) .and. (long .eq. -165.5) ) then
              city = 'PACIFIC OCEAN'
          else
             pointOut = .FALSE.
          end if
          if (pointOut) then
             write(outfile,110), city, lat, long, jgrid, igrid
             write(outfile,106)
110          format 
     1          (/,A20," lat:",F7.2,"   lon:",F7.2,3X,"grid:",1X,2(I4))
106          format (7X,'LEVEL,'2X,'TIME',1X,'CONCENTRATION')
          end if
c
c         ... compute min, max, total, and average values for each level and hour
          do itime = 1, numTimes
             do ilevel = 1, numLevels
c               ... output selected points 
                if (pointOut) 
     1              write(outfile,105),ilevel,itime,amnt(ilevel,itime)
105             format (4X,I5,2X,I5,2X,F10.2)
c         
                total(ilevel,itime) = total(ilevel,itime) + 
     1                                amnt(ilevel,itime)
                if ( ( (amnt(ilevel,itime) .LT. min(ilevel,itime)) .AND.
     1               (amnt(ilevel,itime) .GT. 0.0) )
     2              .OR. (min(ilevel,itime) .LE. 0.0) ) then
                   min(ilevel,itime) = amnt(ilevel,itime)
                   minj(ilevel,itime) = jgrid
                   mini(ilevel,itime) = igrid
                end if
C               ... compute five maximum values.. this would be easier in C!
C               ...    we will just brute force it
                done = .false.
                iimax = 1 
c               ... dh 3/16/95 corrected to LE had prblms if equal
                if ((amnt(ilevel,itime) .LE. max(ilevel,itime,nummax))
     1             .OR. (amnt(ilevel,itime).LE.0.0)) done = .TRUE.      
                do while (.NOT. done)
                   if ( amnt(ilevel,itime) 
     1                  .GT. max(ilevel,itime,iimax)) then
                      do imax =  nummax, iimax+1, -1
                         max(ilevel,itime,imax) = 
     1                                    max(ilevel,itime,imax-1)
                         maxj(ilevel,itime,imax) = 
     1                                    maxj(ilevel,itime,imax-1)
                         maxi(ilevel,itime,imax) = 
     1                                    maxi(ilevel,itime,imax-1)
                      end do
                      max(ilevel,itime,iimax) = amnt(ilevel,itime)
                      maxj(ilevel,itime,iimax) = jgrid
                      maxi(ilevel,itime,iimax) = igrid
                      done = .TRUE.
                   else
                      iimax = iimax + 1 
                   end if
                end do
             end do
          end do
       end do
c      ... NOTE: delete one of these format statements,
c      ...       read the header of the data file to choose the appropriate one
c      ...     Default is E10.6
c104    format (2(I3),1X,240(E10.6,1X))
c104    format (2(I3),1X,240(E12.6,1X))

1000   continue
c
c      ... print out some values to assure us that things are roughly correct!
       write(outfile,*),
     1   '----------------------------------------------------------'
       do itime = 1, numTimes
          write(outfile,*) ' '
          write(outfile,*),'TIME: ',itime
          do ilevel = 1, numLevels
c            ... average is total divided by total number of grids
             average = total(ilevel,itime)/(360.*180.)
             write(outfile,*),'  LEVEL: ',ilevel
             write(outfile,*),'   TOTAL: ',total(ilevel,itime),

     1                '      AVERAGE OVER ALL GRIDS: ',average
c
             lat = (float(minj(ilevel,itime))-91.)+0.5
             long = (float(mini(ilevel,itime))-181.)+0.5            
             write(outfile,111),min(ilevel,itime),lat, long, 
     1              minj(ilevel,itime),mini(ilevel,itime)
111          format(4X,'MIN VAL: ',E10.4,' lat:',f7.2,
     1          '   lon:',F7.2,'   grid(lat,long):',1x,2I4) 
c
             do imax = 1, nummax 
                lat = (float(maxj(ilevel,itime,imax))-91.)+0.5
                long = (float(maxi(ilevel,itime,imax))-181.)+0.5
                write(outfile,112),max(ilevel,itime,imax), lat, long, 
     1             maxj(ilevel,itime,imax),maxi(ilevel,itime,imax)
112             format(4X,'MAX VAL: ',E10.4,' lat:',F7.2,
     1          '   lon:',F7.2,'   grid(lat,long):',1X,2I4)
             end do
          end do
       end do
       stop
       end

file readGeia.in
nvoc90mn1.1a
nvoc90mn1.stats


(last modified 02/28/05)