Making a Metadata sheet... beyond just the Table of Contents

This is a draft of a metadata table that can be included in Excel workbooks. An example with my data is on the TR server in the <All> folder. The metadata table should be the first sheet in the workbook, and the data table it describes should be the ONLY other sheet. Moreover, that data sheet needs to be quite specific to the kind of data. Survey data, for example, should not be in the same sheet as manipulative experiment data. My reason for saying this is that you need to be able to do all the statistics, figure generation, and other analysis of the data without complicated row exclusions. If the metadata sheet is substantially the same for the two cases, just copy it between workbooks. If you are unsure what the best approach would be, I will be happy to discuss it with you.

You should also not include data analysis in the workbook. Remember, this is only the archive of the data; it is not your working notebook.

A caveat... though this is a good start and will never be wasted effort, there may well be additional suggestions or changes as we figure out just how to make these archives generally available, i.e. how to use Morpho or other metadata programs, and how to include recognized standard species identifications. I will take on as much of that as I can to assist you.

So... the first sheet of the workbook should follow this model. Each row is a row in the sheet.



Investigator | Cheeseman | This should be the person responsible for the data.
Date | When the observations were made | Period covered by the data
Location | Twin Cays, Belize
Longitude boundary - west | -88.109 | These lines can all be copied to (or from) the MDTOC and are used in the actual metadata file, too. Note... the accepted standard, not just mine, is DD.DDDDD format, with west longitude and south latitude as negative numbers. By putting longitude first, data can also be plotted, even within Excel, but especially in EASy and other GIS applications.
Longitude boundary - east | -88.098 | ditto
Latitude boundary - south | 16.823 | ditto
Latitude boundary - north | 16.836 | ditto
Number of observations | 285 | Only if your data end up in a delimited text file can metadata programs count the observations. It is still good, from the standpoint of checking the data sets, to have this here.
Number of variables | 27 | How many columns are there in the data set itself?
Missing data code | Blank | If you use a particular value when a point is missing, what is it? Since it can be zero, it is important to make it clear when zeros are actual numbers, not missing values.
Narrative summary | [For example] This file contains net assimilation (photosynthesis) and stomatal conductance data for Rhizophora mangle leaves. Data were taken at ambient temperature. Irradiance was controlled at or above light saturation. The set includes some data at elevated CO2.
Archive contact | | This could be you, me, or anyone else who is handling the archive for you.
Archive email address
Archive postal address
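
If your field notes record positions in degrees, minutes, and seconds, they need converting before they go into the boundary rows above. What follows is a minimal Python sketch of that conversion, assuming each reading comes with a hemisphere letter (N, S, E, or W). The function name dms_to_decimal and the sample reading are made up for illustration; the reading was chosen only so that the result matches the west boundary shown above.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert a degrees/minutes/seconds reading to signed decimal degrees.

    West longitudes and south latitudes come back as negative numbers.
    """
    hemisphere = hemisphere.strip().upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be N, S, E, or W")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    if hemisphere in ("S", "W"):
        value = -value
    # Five decimal places matches the DD.DDDDD convention.
    return round(value, 5)


# Illustrative reading (not taken from the field data): 88 deg 6 min 32.4 sec W
west = dms_to_decimal(88, 6, 32.4, "W")
print(f"{west:.5f}")  # prints -88.10900
```
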
DATA ACQUISITION
INSTRUMENTATION | This section tells you and users what tools you used in getting these data. In some cases, the parts of the instrument are separate and important, like the illuminator head for my IRGA. The software version might be important in the future, e.g. if it turns out that one version had a bad glitch.
Instrument
Particulars of important parts
Software and version
Reference to protocol or methods
Fixed or constant parameters | | This could be several lines of settings that are the same for all measurements. It could even help you set up the next time.
QC/QA | [For example] Data have been edited to include only those points which are considered reliable... all set parameters as expected, Ci and conductance positive and within expected ranges, and first observations after the leaf was put into the chamber. Stability of assimilation and conductance was monitored using the graphical display, and data were logged only when both were stable. | Not all data are good data... some are out of range, some were recorded before the signal was actually stable, etc. It is each person's responsibility to make sure that crap is excluded from the sets. A brief statement here says how it was done, and at least serves to assure the user that it has been done.
DATA DETAILS
COLUMN HEADING | Each line of the data table contains certain elements. The first line of the table can be the headings, and metadata programs can import these along with the data. Make sure there are exactly the same number of column headings as you specified in "Number of variables".
Obs | Integer | Even if you don't think you use this kind of stuff in your own analyses, PLEASE include things like this, and PLEASE try to do it the same way in all your data files. Since much of it can be copied and pasted between rows, your investment is very little, and the pay-off to the project and to biocomplexity research generally may be big.

These columns tell you about the measurement, where it was taken, parameters needed to group it for statistical purposes, data needed to allow comparisons with other measurements of the same site, or even plant or leaf. The variable "type" is actually needed by and used in metadata programs.

It is important, also, to specify units of measurement here... but NOT in the column headings (the metadata programs won't be able to find them there anyhow, and they can make the table quite messy). See above for a note about Longitude and Latitude data.

Date | Date
Site | Text
Grid location | Text
Longitude | Real
Latitude | Real
Zone | Text - include additional lines to explain all the codes you use.
Treatment | Text - include additional lines to explain all the codes you use.
Exposure | Text - include additional lines to explain all the codes you use.
Tag | Text - I use this to record any tags that may be on trees I measure.
Leaf | Text - include additional lines to explain all the codes you use.
PARi | Real | umol quanta/m2/s | Irradiance measured with sensor inside leaf chamber
CO2R | Real | umol/mol | CO2 partial pressure in reference stream entering leaf chamber
etc. | type of variable | units of the measurement | explanation
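
The counts and boundaries recorded in the metadata sheet are only useful if they agree with the data sheet. As a rough sketch, a short Python script can compare the two. It assumes the data sheet has been saved as comma-delimited text with the headings in the first line, blank cells as the missing data code, and columns named Longitude and Latitude. The function name check_export and the three sample rows are invented for illustration; only the boundary values come from the example sheet above.

```python
import csv
import io


def check_export(text, n_observations, n_variables,
                 west, east, south, north, delimiter=","):
    """Return a list of mismatches between a delimited export and its metadata."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["file is empty"]
    header, data = rows[0], rows[1:]
    problems = []
    if len(header) != n_variables:
        problems.append(
            f"expected {n_variables} column headings, found {len(header)}")
    if len(data) != n_observations:
        problems.append(
            f"expected {n_observations} observations, found {len(data)}")
    for name, low, high in (("Longitude", west, east),
                            ("Latitude", south, north)):
        if name not in header:
            problems.append(f"no {name} column")
            continue
        col = header.index(name)
        # Line 1 of the file is the heading line, so data start at line 2.
        for line_no, row in enumerate(data, start=2):
            cell = row[col].strip() if col < len(row) else ""
            if cell == "":
                continue  # blank is the missing data code in this sketch
            try:
                value = float(cell)
            except ValueError:
                problems.append(f"line {line_no}: {name} {cell!r} is not a number")
                continue
            if not low <= value <= high:
                problems.append(
                    f"line {line_no}: {name} {value} outside {low}..{high}")
    return problems


# Made-up sample rows; the second observation lies west of the stated boundary.
sample = (
    "Obs,Date,Site,Longitude,Latitude\n"
    "1,2003-01-15,A,-88.105,16.830\n"
    "2,2003-01-15,A,-88.200,16.830\n"
    "3,2003-01-16,B,,\n"
)
problems = check_export(sample, 3, 5, -88.109, -88.098, 16.823, 16.836)
for problem in problems:
    print(problem)
```

An empty list means the export agrees with the metadata sheet on these points only. It says nothing about whether the measurements themselves are reliable, which is still the job of the QC/QA step.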