Columns
Table | Table Type | Column | Column Type | Size | Nulls | Default | Comments |
---|---|---|---|---|---|---|---|
feature_cvterm_dbxref | Table | feature_cvterm_dbxref_id | bigserial | 19 | | nextval('sequence.feature_cvterm_dbxref_feature_cvterm_dbxref_id_seq'::regclass) | |
featureloc | Table | phase | int4 | 10 | √ | null | Phase of translation with respect to srcfeature_id. Values are 0, 1, 2. It may not be possible to manifest this column for some features such as exons, because the phase is dependent on the spliceform (the same exon can appear in multiple spliceforms). This column is mostly useful for predicted exons and CDSs. |
feature | Table | type_id | int8 | 19 | | null | A required reference to a table:cvterm giving the feature type. This will typically be a Sequence Ontology identifier. This column is thus used to subclass the feature table. |
feature_dbxref | Table | dbxref_id | int8 | 19 | | null | |
feature_cvtermprop | Table | feature_cvterm_id | int8 | 19 | | null | |
feature_cvtermprop | Table | rank | int4 | 10 | | 0 | Property-Value ordering. Any feature_cvterm can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used. |
feature_relationshipprop_pub | Table | pub_id | int8 | 19 | | null | |
featureloc | Table | fmax | int8 | 19 | √ | null | The rightmost/maximal boundary in the linear range represented by the featureloc. Sometimes (e.g. in Bioperl) this is called -end- although this is confusing because it does not necessarily represent the 3-prime coordinate. Important: This uses space-based (interbase) coordinates, counting from zero. No conversion is required to go from fmax to the rightmost coordinate in a base-oriented system that counts from 1 (e.g. GFF, Bioperl). |
synonym | Table | type_id | int8 | 19 | | null | Types would be symbol and fullname for now. |
synonym | Table | name | varchar | 255 | | null | The synonym itself. Should be human-readable machine-searchable ascii text. |
feature_relationship | Table | rank | int4 | 10 | | 0 | The ordering of subject features with respect to the object feature may be important (for example, exon ordering on a transcript - not always derivable if you take trans spliced genes into consideration). Rank is used to order these; starts from zero. |
featureloc | Table | locgroup | int4 | 10 | | 0 | This is used to manifest redundant, derivable extra locations for a feature. The default locgroup=0 is used for the DIRECT location of a feature. Important: most Chado users may never use featurelocs with locgroup > 0. Transitively derived locations are indicated with locgroup > 0. For example, the position of an exon on a BAC and in global chromosome coordinates. This column is used to differentiate these groupings of locations. The default locgroup 0 is used for the main or primary location, from which the others can be derived via coordinate transformations. Another example of redundant locations is storing ORF coordinates relative to both transcript and genome. Redundant locations open the possibility of the database getting into inconsistent states; this schema gives us the flexibility of both warehouse instantiations with redundant locations (easier for querying) and management instantiations with no redundant locations. An example of using both locgroup and rank: imagine a feature indicating a conserved region between the chromosomes of two different species. We may want to keep redundant locations on both contigs and chromosomes. We would thus have 4 locations for the single conserved region feature - two distinct locgroups (contig level and chromosome level) and two distinct ranks (for the two species). |
feature | Table | name | varchar | 255 | √ | null | The optional human-readable common name for a feature, for display purposes. |
feature_cvterm | Table | pub_id | int8 | 19 | | null | Provenance for the annotation. Each annotation should have a single primary publication (which may be of the appropriate type for computational analyses) where more details can be found. Additional provenance dbxrefs can be attached using feature_cvterm_dbxref. |
feature | Table | md5checksum | bpchar | 32 | √ | null | The 32-character checksum of the sequence, calculated using the MD5 algorithm. This is practically guaranteed to be unique for any feature. This column thus acts as a unique identifier on the mathematical sequence. |
featureloc | Table | srcfeature_id | int8 | 19 | √ | null | The source feature which this location is relative to. Every location is relative to another feature (however, this column is nullable, because the srcfeature may not be known). All locations are -proper- that is, nothing should be located relative to itself. No cycles are allowed in the featureloc graph. |
feature_pubprop | Table | value | text | 2147483647 | √ | null | |
feature_relationshipprop | Table | feature_relationshipprop_id | bigserial | 19 | | nextval('sequence.feature_relationshipprop_feature_relationshipprop_id_seq'::regclass) | |
feature_cvterm | Table | is_not | bool | 1 | | false | If this is set to true, then this annotation is interpreted as a NEGATIVE annotation - i.e. the feature does NOT have the specified function, process, component, part, etc. See GO docs for more details. |
featureprop_pub | Table | featureprop_id | int8 | 19 | | null | |
feature_cvterm_dbxref | Table | dbxref_id | int8 | 19 | | null | |
featureloc | Table | strand | int2 | 5 | √ | null | The orientation/directionality of the location. Should be 0, -1 or +1. |
feature_synonym | Table | pub_id | int8 | 19 | | null | The pub_id link is for relating the usage of a given synonym to the publication in which it was used. |
feature_relationshipprop | Table | value | text | 2147483647 | √ | null | The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query. |
feature | Table | seqlen | int8 | 19 | √ | null | The length of the residue feature. See column:residues. This column is partially redundant with the residues column, and also with featureloc. This column is required because the location may be unknown and the residue sequence may not be manifested, yet it may be desirable to store and query the length of the feature. The seqlen should always be manifested where the length of the sequence is known. |
feature_synonym | Table | is_current | bool | 1 | | false | The is_current boolean indicates whether the linked synonym is the current -official- symbol for the linked feature. |
feature | Table | feature_id | bigserial | 19 | | nextval('sequence.feature_feature_id_seq'::regclass) | |
feature_synonym | Table | synonym_id | int8 | 19 | | null | |
feature_synonym | Table | is_internal | bool | 1 | | false | Typically a synonym exists so that somebody querying the db with an obsolete name can find the object they're looking for (under its current name). If the synonym has been used publicly and deliberately (e.g. in a paper), it may also be listed in reports as a synonym. If the synonym was not used deliberately (e.g. there was a typo which went public), then the is_internal boolean may be set to -true- so that it is known that the synonym is -internal- and should be queryable but should not be listed in reports as a valid synonym. |
featureloc_pub | Table | featureloc_id | int8 | 19 | | null | |
feature_cvterm_pub | Table | pub_id | int8 | 19 | | null | |
featureloc | Table | rank | int4 | 10 | | 0 | Used when a feature has >1 location, otherwise the default rank 0 is used. Some features (e.g. blast hits and HSPs) have two locations - one on the query and one on the subject. Rank is used to differentiate these. Rank=0 is always used for the query, Rank=1 for the subject. For multiple alignments, assignment of rank is arbitrary. Rank is also used for sequence_variant features, such as SNPs. Rank=0 indicates the wildtype (or baseline) feature, Rank=1 indicates the mutant (or compared) feature. |
feature_pubprop | Table | feature_pub_id | int8 | 19 | | null | |
featureprop | Table | rank | int4 | 10 | | 0 | Property-Value ordering. Any feature can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used. |
feature_pub | Table | feature_id | int8 | 19 | | null | |
feature_synonym | Table | feature_synonym_id | bigserial | 19 | | nextval('sequence.feature_synonym_feature_synonym_id_seq'::regclass) | |
feature_relationship | Table | feature_relationship_id | bigserial | 19 | | nextval('sequence.feature_relationship_feature_relationship_id_seq'::regclass) | |
feature_contact | Table | feature_contact_id | bigserial | 19 | | nextval('sequence.feature_contact_feature_contact_id_seq'::regclass) | |
feature_relationshipprop_pub | Table | feature_relationshipprop_id | int8 | 19 | | null | |
featureprop_pub | Table | pub_id | int8 | 19 | | null | |
feature_contact | Table | contact_id | int8 | 19 | | null | |
feature_relationshipprop | Table | feature_relationship_id | int8 | 19 | | null | |
featureprop | Table | featureprop_id | bigserial | 19 | | nextval('sequence.featureprop_featureprop_id_seq'::regclass) | |
feature | Table | uniquename | text | 2147483647 | | null | The unique name for a feature; may not necessarily be particularly human-readable, although this is preferred. This name must be unique for this type of feature within this organism. |
feature_relationshipprop_pub | Table | feature_relationshipprop_pub_id | bigserial | 19 | | nextval('sequence.feature_relationshipprop_pub_feature_relationshipprop_pub_i_seq'::regclass) | |
feature_dbxref | Table | is_current | bool | 1 | | true | True if this secondary dbxref is the most up-to-date accession in the corresponding db. Retired accessions should set this field to false. |
feature_dbxref | Table | feature_dbxref_id | bigserial | 19 | | nextval('sequence.feature_dbxref_feature_dbxref_id_seq'::regclass) | |
featureprop | Table | feature_id | int8 | 19 | | null | |
feature | Table | timeaccessioned | timestamp | 29 | | now() | For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado. |
feature_pubprop | Table | feature_pubprop_id | bigserial | 19 | | nextval('sequence.feature_pubprop_feature_pubprop_id_seq'::regclass) | |
feature_relationship | Table | value | text | 2147483647 | √ | null | Additional notes or comments. |
feature_cvterm_pub | Table | feature_cvterm_pub_id | bigserial | 19 | | nextval('sequence.feature_cvterm_pub_feature_cvterm_pub_id_seq'::regclass) | |
feature_relationshipprop | Table | type_id | int8 | 19 | | null | The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. Currently there is no standard ontology for feature_relationship property types. |
featureprop_pub | Table | featureprop_pub_id | bigserial | 19 | | nextval('sequence.featureprop_pub_featureprop_pub_id_seq'::regclass) | |
feature_cvterm_dbxref | Table | feature_cvterm_id | int8 | 19 | | null | |
featureloc | Table | is_fmax_partial | bool | 1 | | false | This is typically false, but may be true if the value for column:fmax is inaccurate or the rightmost part of the range is unknown/unbounded. |
feature_cvterm_pub | Table | feature_cvterm_id | int8 | 19 | | null | |
feature_contact | Table | feature_id | int8 | 19 | | null | |
synonym | Table | synonym_id | bigserial | 19 | | nextval('sequence.synonym_synonym_id_seq'::regclass) | |
feature | Table | timelastmodified | timestamp | 29 | | now() | For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado. |
feature_cvterm | Table | feature_cvterm_id | bigserial | 19 | | nextval('sequence.feature_cvterm_feature_cvterm_id_seq'::regclass) | |
feature_cvterm | Table | feature_id | int8 | 19 | | null | |
feature | Table | is_analysis | bool | 1 | | false | Boolean indicating whether this feature is annotated or the result of an automated analysis. Analysis results also use the companalysis module. Note that the dividing line between analysis and annotation may be fuzzy; this should be determined on a per-project basis in a consistent manner. One requirement is that there should only be one non-analysis version of each wild-type gene feature in a genome, whereas the same gene feature can be predicted multiple times in different analyses. |
feature_pub | Table | feature_pub_id | bigserial | 19 | | nextval('sequence.feature_pub_feature_pub_id_seq'::regclass) | |
feature_pubprop | Table | type_id | int8 | 19 | | null | |
featureloc | Table | feature_id | int8 | 19 | | null | The feature that is being located. Any feature can have zero or more featurelocs. |
feature_cvterm | Table | cvterm_id | int8 | 19 | | null | |
feature_cvtermprop | Table | type_id | int8 | 19 | | null | The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. cvterms may come from the OBO evidence code cv. |
featureprop | Table | type_id | int8 | 19 | | null | The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. Certain property types will only apply to certain feature types (e.g. the anticodon property will only apply to tRNA features); the types here come from the sequence feature property ontology. |
featureloc | Table | fmin | int8 | 19 | √ | null | The leftmost/minimal boundary in the linear range represented by the featureloc. Sometimes (e.g. in Bioperl) this is called -start- although this is confusing because it does not necessarily represent the 5-prime coordinate. Important: This uses space-based (interbase) coordinates, counting from zero. To convert this to the leftmost position in a base-oriented system (e.g. GFF, Bioperl), add 1 to fmin. |
featureloc | Table | residue_info | text | 2147483647 | √ | null | Alternative residues, when these differ from feature.residues. For instance, a SNP feature located on a wild-type and a mutant protein would have different alternative residues. For alignment/similarity features, the alternative residues are used to represent the alignment string (CIGAR format). Note on variation features: even if we do not want to instantiate a mutant chromosome/contig feature, we can still represent a SNP, etc., with two locations: one (rank 0) on the genome, and the other (rank 1) with most fields null except for the alternative residues. |
feature_relationship_pub | Table | pub_id | int8 | 19 | | null | |
synonym | Table | synonym_sgml | varchar | 255 | | null | The fully specified synonym, with any non-ascii characters encoded in SGML. |
featureloc | Table | featureloc_id | bigserial | 19 | | nextval('sequence.featureloc_featureloc_id_seq'::regclass) | |
feature_cvtermprop | Table | value | text | 2147483647 | √ | null | The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query. |
feature_dbxref | Table | feature_id | int8 | 19 | | null | |
feature_pub | Table | pub_id | int8 | 19 | | null | |
feature_pubprop | Table | rank | int4 | 10 | | 0 | |
feature_relationship | Table | object_id | int8 | 19 | | null | The object of the subj-predicate-obj sentence. This is typically the container feature. |
feature_relationship | Table | type_id | int8 | 19 | | null | Relationship type between subject and object. This is a cvterm, typically from the OBO relationship ontology, although other relationship types are allowed. The most common relationship type is OBO_REL:part_of. Valid relationship types are constrained by the Sequence Ontology. |
feature_cvterm | Table | rank | int4 | 10 | | 0 | |
feature_relationshipprop | Table | rank | int4 | 10 | | 0 | Property-Value ordering. Any feature_relationship can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used. |
feature_relationship_pub | Table | feature_relationship_id | int8 | 19 | | null | |
feature_synonym | Table | feature_id | int8 | 19 | | null | |
feature_relationship_pub | Table | feature_relationship_pub_id | bigserial | 19 | | nextval('sequence.feature_relationship_pub_feature_relationship_pub_id_seq'::regclass) | |
feature_relationship | Table | subject_id | int8 | 19 | | null | The subject of the subj-predicate-obj sentence. This is typically the subfeature. |
feature | Table | residues | text | 2147483647 | √ | null | A sequence of alphabetic characters representing biological residues (nucleic acids, amino acids). This column does not need to be manifested for all features; it is optional for features such as exons where the residues can be derived from the featureloc. It is recommended that the value for this column be manifested for features which may have non-contiguous sublocations (e.g. transcripts), since derivation at query time is non-trivial. For expressed sequence, the DNA sequence should be used rather than the RNA sequence. The default storage method for the residues column is EXTERNAL, which will store it uncompressed to make substring operations faster. |
featureprop | Table | value | text | 2147483647 | √ | null | The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query. |
featureloc_pub | Table | featureloc_pub_id | bigserial | 19 | | nextval('sequence.featureloc_pub_featureloc_pub_id_seq'::regclass) | |
feature | Table | is_obsolete | bool | 1 | | false | Boolean indicating whether this feature has been obsoleted. Some chado instances may choose to simply remove the feature altogether, others may choose to keep an obsolete row in the table. |
feature | Table | organism_id | int8 | 19 | | null | The organism to which this feature belongs. This column is mandatory. |
featureloc | Table | is_fmin_partial | bool | 1 | | false | This is typically false, but may be true if the value for column:fmin is inaccurate or the leftmost part of the range is unknown/unbounded. |
feature_cvtermprop | Table | feature_cvtermprop_id | bigserial | 19 | | nextval('sequence.feature_cvtermprop_feature_cvtermprop_id_seq'::regclass) | |
feature | Table | dbxref_id | int8 | 19 | √ | null | An optional primary public stable identifier for this feature. Secondary identifiers and external dbxrefs go in the table feature_dbxref. |
featureloc_pub | Table | pub_id | int8 | 19 | | null | |
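
The featureloc.fmin and featureloc.fmax comments describe interbase (zero-based) coordinates and how they map to a base-oriented system that counts from 1: add 1 to fmin and leave fmax unchanged. The following is a minimal Python sketch of that conversion. The names `Interbase`, `to_one_based`, `from_one_based`, and `span_length` are hypothetical helpers for illustration and are not part of the schema; the check that fmin does not exceed fmax is an assumption made for the sketch.

```python
from typing import NamedTuple, Tuple


class Interbase(NamedTuple):
    """A featureloc range in interbase (zero-based) coordinates."""
    fmin: int
    fmax: int


def to_one_based(loc: Interbase) -> Tuple[int, int]:
    """Convert interbase (fmin, fmax) to a 1-based inclusive (start, end).

    Per the column comments: add 1 to fmin; fmax needs no conversion.
    """
    if loc.fmin > loc.fmax:
        # Assumption for this sketch: a range never runs right-to-left.
        raise ValueError("fmin must not exceed fmax")
    return loc.fmin + 1, loc.fmax


def from_one_based(start: int, end: int) -> Interbase:
    """Convert a 1-based inclusive (start, end) back to interbase."""
    return Interbase(start - 1, end)


def span_length(loc: Interbase) -> int:
    """Number of residues covered by an interbase range."""
    return loc.fmax - loc.fmin


if __name__ == "__main__":
    loc = Interbase(fmin=0, fmax=10)
    print(to_one_based(loc))   # first ten bases of the source feature
    print(span_length(loc))
```

Note that a zero-length interbase range (fmin equal to fmax, for example an insertion site) has no direct 1-based inclusive equivalent; the sketch returns a start one greater than the end in that case.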
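
The feature.md5checksum and feature.seqlen comments describe values derived from the residue sequence: a 32-character MD5 checksum and the sequence length. A minimal Python sketch of computing both follows. The function name `residue_digest` is hypothetical, and hashing the residues exactly as stored, encoded as ASCII with no case normalization, is an assumption; the source does not specify any normalization.

```python
import hashlib
from typing import Tuple


def residue_digest(residues: str) -> Tuple[str, int]:
    """Return (md5checksum, seqlen) for a residue string.

    Assumption: residues are hashed exactly as given, ASCII-encoded.
    """
    checksum = hashlib.md5(residues.encode("ascii")).hexdigest()
    return checksum, len(residues)


if __name__ == "__main__":
    checksum, seqlen = residue_digest("ATGGCGTAA")
    print(checksum, seqlen)
```

Because the checksum depends only on the sequence, two features with identical residues produce the same value, which is what allows the column to identify the sequence itself rather than the feature row.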