Warning: This page is under construction. It will continue to change during the next few months.


EMP: Enzymes and Metabolic Pathways

Status

The EMP data currently comprise around 2,500,000 data elements derived from 11,000 publications. Although it took over 10 years for this amount to accumulate, the encoding scheme changed very little along the way. It remains a mixture of formatting and tagging whose rules have never been expressed as a formal grammar. Until recently, visual inspection was the only means of enforcing and validating those rules.

Although the original intent was to make the data entirely machine-readable, only simple keyword search techniques could cope with the complexity and variance in the data. The development of a parser that combines heuristics with a number of formal grammars has made it possible to recognize relations in addition to individual facts.

Interface

The data accessible from this page are organized as a number of tables stored in an SQL engine (PostgreSQL). At present, there is no generic interface to these tables other than SQL. For those not familiar with SQL, we will provide a number of query forms for accessing the most common kinds of data.

Schema

The schema is quite simple. Each EMP data type (not to be confused with SQL types) is stored in a table bearing its name, one element per row. Each table has a single data attribute with the same name as the corresponding EMP type. Some numeric types also have a unit of measurement. Other attributes, common to all types, specify the location of the element in the EMP source. The only exception (so far) is the author name table, which has more than one data attribute (last name, first and middle names, and initials).
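The layout described above can be sketched roughly as follows. This is only an illustration, not the actual EMP schema: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server, and the type names (organism, temperature), the location attributes (record_id, line_no), the unit column name, and the sample values are all hypothetical. The join on a shared location attribute is likewise an assumption about how elements from the same source record might be brought together.

```python
import sqlite3

# Hypothetical location attributes shared by every table; the real names
# used in the EMP database are not given on this page.
LOCATION_COLUMNS = "record_id INTEGER NOT NULL, line_no INTEGER NOT NULL"


def create_type_table(conn, emp_type, sql_type, has_unit=False):
    """Create one table for one EMP data type, named after the type."""
    unit = ", unit TEXT" if has_unit else ""
    # The single data attribute carries the same name as the table.
    conn.execute(
        f'CREATE TABLE "{emp_type}" '
        f'("{emp_type}" {sql_type}{unit}, {LOCATION_COLUMNS})'
    )


conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL server
create_type_table(conn, "organism", "TEXT")
create_type_table(conn, "temperature", "REAL", has_unit=True)

# One element per row, each tagged with its (made-up) location in the source.
conn.execute(
    'INSERT INTO "organism" VALUES (?, ?, ?)', ("Escherichia coli", 1, 12)
)
conn.execute(
    'INSERT INTO "temperature" VALUES (?, ?, ?, ?)', (37.0, "C", 1, 15)
)

# Assumption: elements that share a location attribute belong to the same
# source record, so a join on it brings related elements together.
rows = conn.execute(
    'SELECT o."organism", t."temperature", t.unit '
    'FROM "organism" AS o JOIN "temperature" AS t USING (record_id)'
).fetchall()
print(rows)
```

Keeping one narrow table per type means a new EMP type only requires a new table; the cost is that any question spanning several types has to be answered with joins like the one above.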

Use the following links to find out more about the currently loaded tables (details will be added soon).

Examples:


Gene Selkov Jr.
Last modified: Fri May 1 22:33:50 MSD 1998