chado Module

Columns

Table	Table Type	Column	Data Type	Size	Nulls	Default	Comments
feature_cvterm_dbxref	Table	feature_cvterm_dbxref_id	bigserial	19		nextval('sequence.feature_cvterm_dbxref_feature_cvterm_dbxref_id_seq'::regclass)
featureloc	Table	phase	int4	10	√	null	Phase of translation with respect to srcfeature_id. Values are 0, 1, 2. It may not be possible to manifest this column for some features such as exons, because the phase is dependent on the spliceform (the same exon can appear in multiple spliceforms). This column is mostly useful for predicted exons and CDSs.
feature	Table	type_id	int8	19		null	A required reference to a table:cvterm giving the feature type. This will typically be a Sequence Ontology identifier. This column is thus used to subclass the feature table.
feature_dbxref	Table	dbxref_id	int8	19		null
feature_cvtermprop	Table	feature_cvterm_id	int8	19		null
feature_cvtermprop	Table	rank	int4	10		0	Property-Value ordering. Any feature_cvterm can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used.
feature_relationshipprop_pub	Table	pub_id	int8	19		null
featureloc	Table	fmax	int8	19	√	null	The rightmost/maximal boundary in the linear range represented by the featureloc. Sometimes (e.g. in Bioperl) this is called -end- although this is confusing because it does not necessarily represent the 3-prime coordinate. Important: This uses space-based (interbase) coordinates, counting from zero. No conversion is required to go from fmax to the rightmost coordinate in a base-oriented system that counts from 1 (e.g. GFF, Bioperl).
synonym	Table	type_id	int8	19		null	Types would be symbol and fullname for now.
synonym	Table	name	varchar	255		null	The synonym itself. Should be human-readable, machine-searchable ASCII text.
feature_relationship	Table	rank	int4	10		0	The ordering of subject features with respect to the object feature may be important (for example, exon ordering on a transcript - not always derivable if you take trans spliced genes into consideration). Rank is used to order these; starts from zero.
featureloc	Table	locgroup	int4	10		0	This is used to manifest redundant, derivable extra locations for a feature. The default locgroup=0 is used for the DIRECT location of a feature. Important: most Chado users may never use featurelocs with locgroup > 0. Transitively derived locations are indicated with locgroup > 0. For example, the position of an exon on a BAC and in global chromosome coordinates. This column is used to differentiate these groupings of locations. The default locgroup 0 is used for the main or primary location, from which the others can be derived via coordinate transformations. Another example of redundant locations is storing ORF coordinates relative to both transcript and genome. Redundant locations open the possibility of the database getting into inconsistent states; this schema gives us the flexibility of both warehouse instantiations with redundant locations (easier for querying) and management instantiations with no redundant locations. An example of using both locgroup and rank: imagine a feature indicating a conserved region between the chromosomes of two different species. We may want to keep redundant locations on both contigs and chromosomes. We would thus have 4 locations for the single conserved region feature - two distinct locgroups (contig level and chromosome level) and two distinct ranks (for the two species).
feature	Table	name	varchar	255	√	null	The optional human-readable common name for a feature, for display purposes.
feature_cvterm	Table	pub_id	int8	19		null	Provenance for the annotation. Each annotation should have a single primary publication (which may be of the appropriate type for computational analyses) where more details can be found. Additional provenance dbxrefs can be attached using feature_cvterm_dbxref.
feature	Table	md5checksum	bpchar	32	√	null	The 32-character checksum of the sequence, calculated using the MD5 algorithm. This is practically guaranteed to be unique for any feature. This column thus acts as a unique identifier on the mathematical sequence.
featureloc	Table	srcfeature_id	int8	19	√	null	The source feature which this location is relative to. Every location is relative to another feature (however, this column is nullable, because the srcfeature may not be known). All locations are -proper- that is, nothing should be located relative to itself. No cycles are allowed in the featureloc graph.
feature_pubprop	Table	value	text	2147483647	√	null
feature_relationshipprop	Table	feature_relationshipprop_id	bigserial	19		nextval('sequence.feature_relationshipprop_feature_relationshipprop_id_seq'::regclass)
feature_cvterm	Table	is_not	bool	1		false	If this is set to true, then this annotation is interpreted as a NEGATIVE annotation - i.e. the feature does NOT have the specified function, process, component, part, etc. See GO docs for more details.
featureprop_pub	Table	featureprop_id	int8	19		null
feature_cvterm_dbxref	Table	dbxref_id	int8	19		null
featureloc	Table	strand	int2	5	√	null	The orientation/directionality of the location. Should be 0, -1 or +1.
feature_synonym	Table	pub_id	int8	19		null	The pub_id link is for relating the usage of a given synonym to the publication in which it was used.
feature_relationshipprop	Table	value	text	2147483647	√	null	The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query.
feature	Table	seqlen	int8	19	√	null	The length of the residue feature. See column:residues. This column is partially redundant with the residues column, and also with featureloc. This column is required because the location may be unknown and the residue sequence may not be manifested, yet it may be desirable to store and query the length of the feature. The seqlen should always be manifested where the length of the sequence is known.
feature_synonym	Table	is_current	bool	1		false	The is_current boolean indicates whether the linked synonym is the current -official- symbol for the linked feature.
feature	Table	feature_id	bigserial	19		nextval('sequence.feature_feature_id_seq'::regclass)
feature_synonym	Table	synonym_id	int8	19		null
feature_synonym	Table	is_internal	bool	1		false	Typically a synonym exists so that somebody querying the db with an obsolete name can find the object they're looking for (under its current name). If the synonym has been used publicly and deliberately (e.g. in a paper), it may also be listed in reports as a synonym. If the synonym was not used deliberately (e.g. there was a typo which went public), then the is_internal boolean may be set to -true- so that it is known that the synonym is -internal- and should be queryable but should not be listed in reports as a valid synonym.
featureloc_pub	Table	featureloc_id	int8	19		null
feature_cvterm_pub	Table	pub_id	int8	19		null
featureloc	Table	rank	int4	10		0	Used when a feature has >1 location, otherwise the default rank 0 is used. Some features (e.g. blast hits and HSPs) have two locations - one on the query and one on the subject. Rank is used to differentiate these. Rank=0 is always used for the query, Rank=1 for the subject. For multiple alignments, assignment of rank is arbitrary. Rank is also used for sequence_variant features, such as SNPs. Rank=0 indicates the wildtype (or baseline) feature, Rank=1 indicates the mutant (or compared) feature.
feature_pubprop	Table	feature_pub_id	int8	19		null
featureprop	Table	rank	int4	10		0	Property-Value ordering. Any feature can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used.
feature_pub	Table	feature_id	int8	19		null
feature_synonym	Table	feature_synonym_id	bigserial	19		nextval('sequence.feature_synonym_feature_synonym_id_seq'::regclass)
feature_relationship	Table	feature_relationship_id	bigserial	19		nextval('sequence.feature_relationship_feature_relationship_id_seq'::regclass)
feature_contact	Table	feature_contact_id	bigserial	19		nextval('sequence.feature_contact_feature_contact_id_seq'::regclass)
feature_relationshipprop_pub	Table	feature_relationshipprop_id	int8	19		null
featureprop_pub	Table	pub_id	int8	19		null
feature_contact	Table	contact_id	int8	19		null
feature_relationshipprop	Table	feature_relationship_id	int8	19		null
featureprop	Table	featureprop_id	bigserial	19		nextval('sequence.featureprop_featureprop_id_seq'::regclass)
feature	Table	uniquename	text	2147483647		null	The unique name for a feature; may not necessarily be particularly human-readable, although this is preferred. This name must be unique for this type of feature within this organism.
feature_relationshipprop_pub	Table	feature_relationshipprop_pub_id	bigserial	19		nextval('sequence.feature_relationshipprop_pub_feature_relationshipprop_pub_i_seq'::regclass)
feature_dbxref	Table	is_current	bool	1		true	True if this secondary dbxref is the most up-to-date accession in the corresponding db. Retired accessions should set this field to false.
feature_dbxref	Table	feature_dbxref_id	bigserial	19		nextval('sequence.feature_dbxref_feature_dbxref_id_seq'::regclass)
featureprop	Table	feature_id	int8	19		null
feature	Table	timeaccessioned	timestamp	29		now()	For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado.
feature_pubprop	Table	feature_pubprop_id	bigserial	19		nextval('sequence.feature_pubprop_feature_pubprop_id_seq'::regclass)
feature_relationship	Table	value	text	2147483647	√	null	Additional notes or comments.
feature_cvterm_pub	Table	feature_cvterm_pub_id	bigserial	19		nextval('sequence.feature_cvterm_pub_feature_cvterm_pub_id_seq'::regclass)
feature_relationshipprop	Table	type_id	int8	19		null	The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. Currently there is no standard ontology for feature_relationship property types.
featureprop_pub	Table	featureprop_pub_id	bigserial	19		nextval('sequence.featureprop_pub_featureprop_pub_id_seq'::regclass)
feature_cvterm_dbxref	Table	feature_cvterm_id	int8	19		null
featureloc	Table	is_fmax_partial	bool	1		false	This is typically false, but may be true if the value for column:fmax is inaccurate or the rightmost part of the range is unknown/unbounded.
feature_cvterm_pub	Table	feature_cvterm_id	int8	19		null
feature_contact	Table	feature_id	int8	19		null
synonym	Table	synonym_id	bigserial	19		nextval('sequence.synonym_synonym_id_seq'::regclass)
feature	Table	timelastmodified	timestamp	29		now()	For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado.
feature_cvterm	Table	feature_cvterm_id	bigserial	19		nextval('sequence.feature_cvterm_feature_cvterm_id_seq'::regclass)
feature_cvterm	Table	feature_id	int8	19		null
feature	Table	is_analysis	bool	1		false	Boolean indicating whether this feature is annotated or the result of an automated analysis. Analysis results also use the companalysis module. Note that the dividing line between analysis and annotation may be fuzzy; this should be determined on a per-project basis in a consistent manner. One requirement is that there should only be one non-analysis version of each wild-type gene feature in a genome, whereas the same gene feature can be predicted multiple times in different analyses.
feature_pub	Table	feature_pub_id	bigserial	19		nextval('sequence.feature_pub_feature_pub_id_seq'::regclass)
feature_pubprop	Table	type_id	int8	19		null
featureloc	Table	feature_id	int8	19		null	The feature that is being located. Any feature can have zero or more featurelocs.
feature_cvterm	Table	cvterm_id	int8	19		null
feature_cvtermprop	Table	type_id	int8	19		null	The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. cvterms may come from the OBO evidence code cv.
featureprop	Table	type_id	int8	19		null	The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. Certain property types will only apply to certain feature types (e.g. the anticodon property will only apply to tRNA features); the types here come from the sequence feature property ontology.
featureloc	Table	fmin	int8	19	√	null	The leftmost/minimal boundary in the linear range represented by the featureloc. Sometimes (e.g. in Bioperl) this is called -start- although this is confusing because it does not necessarily represent the 5-prime coordinate. Important: This uses space-based (interbase) coordinates, counting from zero. To convert this to the leftmost position in a base-oriented system (e.g. GFF, Bioperl), add 1 to fmin.
featureloc	Table	residue_info	text	2147483647	√	null	Alternative residues, when these differ from feature.residues. For instance, a SNP feature located on a wild and mutant protein would have different alternative residues. For alignment/similarity features, the alternative residues are used to represent the alignment string (CIGAR format). Note on variation features: even if we do not want to instantiate a mutant chromosome/contig feature, we can still represent a SNP etc. with 2 locations; one (rank 0) on the genome, the other (rank 1) would have most fields null, except for alternative residues.
feature_relationship_pub	Table	pub_id	int8	19		null
synonym	Table	synonym_sgml	varchar	255		null	The fully specified synonym, with any non-ASCII characters encoded in SGML.
featureloc	Table	featureloc_id	bigserial	19		nextval('sequence.featureloc_featureloc_id_seq'::regclass)
feature_cvtermprop	Table	value	text	2147483647	√	null	The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query.
feature_dbxref	Table	feature_id	int8	19		null
feature_pub	Table	pub_id	int8	19		null
feature_pubprop	Table	rank	int4	10		0
feature_relationship	Table	object_id	int8	19		null	The object of the subj-predicate-obj sentence. This is typically the container feature.
feature_relationship	Table	type_id	int8	19		null	Relationship type between subject and object. This is a cvterm, typically from the OBO relationship ontology, although other relationship types are allowed. The most common relationship type is OBO_REL:part_of. Valid relationship types are constrained by the Sequence Ontology.
feature_cvterm	Table	rank	int4	10		0
feature_relationshipprop	Table	rank	int4	10		0	Property-Value ordering. Any feature_relationship can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used.
feature_relationship_pub	Table	feature_relationship_id	int8	19		null
feature_synonym	Table	feature_id	int8	19		null
feature_relationship_pub	Table	feature_relationship_pub_id	bigserial	19		nextval('sequence.feature_relationship_pub_feature_relationship_pub_id_seq'::regclass)
feature_relationship	Table	subject_id	int8	19		null	The subject of the subj-predicate-obj sentence. This is typically the subfeature.
feature	Table	residues	text	2147483647	√	null	A sequence of alphabetic characters representing biological residues (nucleic acids, amino acids). This column does not need to be manifested for all features; it is optional for features such as exons where the residues can be derived from the featureloc. It is recommended that the value for this column be manifested for features which may have non-contiguous sublocations (e.g. transcripts), since derivation at query time is non-trivial. For expressed sequence, the DNA sequence should be used rather than the RNA sequence. The default storage method for the residues column is EXTERNAL, which will store it uncompressed to make substring operations faster.
featureprop	Table	value	text	2147483647	√	null	The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query.
featureloc_pub	Table	featureloc_pub_id	bigserial	19		nextval('sequence.featureloc_pub_featureloc_pub_id_seq'::regclass)
feature	Table	is_obsolete	bool	1		false	Boolean indicating whether this feature has been obsoleted. Some chado instances may choose to simply remove the feature altogether, others may choose to keep an obsolete row in the table.
feature	Table	organism_id	int8	19		null	The organism to which this feature belongs. This column is mandatory.
featureloc	Table	is_fmin_partial	bool	1		false	This is typically false, but may be true if the value for column:fmin is inaccurate or the leftmost part of the range is unknown/unbounded.
feature_cvtermprop	Table	feature_cvtermprop_id	bigserial	19		nextval('sequence.feature_cvtermprop_feature_cvtermprop_id_seq'::regclass)
feature	Table	dbxref_id	int8	19	√	null	An optional primary public stable identifier for this feature. Secondary identifiers and external dbxrefs go in the table feature_dbxref.
featureloc_pub	Table	pub_id	int8	19		null
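
The featureloc.fmin and featureloc.fmax comments above describe how interbase (space-based, zero-origin) coordinates map to base-oriented, one-origin coordinates such as those in GFF. The sketch below illustrates that arithmetic only. The function names are hypothetical and are not part of Chado or any library; it assumes fmin <= fmax and does not touch a database.

```python
def interbase_to_one_based(fmin, fmax):
    """Convert a Chado interbase range to a one-based, inclusive range.

    Per the column comments: add 1 to fmin; fmax needs no conversion.
    """
    return fmin + 1, fmax


def one_based_to_interbase(start, end):
    """Convert a one-based, inclusive range (e.g. GFF) to interbase."""
    return start - 1, end


def interbase_length(fmin, fmax):
    """Number of residues spanned by an interbase range.

    In interbase coordinates the length is simply fmax - fmin, so a
    zero-length location (fmin == fmax) marks a point between two bases.
    """
    return fmax - fmin


# The first ten bases of a source feature:
# interbase (0, 10) corresponds to one-based positions 1 through 10.
start, end = interbase_to_one_based(0, 10)
```

The round trip one_based_to_interbase(*interbase_to_one_based(fmin, fmax)) returns the original pair, which can serve as a quick sanity check when loading GFF-style coordinates into featureloc.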