CO2A1_XENLA
ID CO2A1_XENLA Reviewed; 1486 AA.
AC Q91717; Q7ZTI6;
DT 01-MAY-2007, integrated into UniProtKB/Swiss-Prot.
DT 01-MAY-2007, sequence version 2.
DT 03-AUG-2022, entry version 94.
DE RecName: Full=Collagen alpha-1(II) chain {ECO:0000250|UniProtKB:P02458};
DE AltName: Full=Alpha-1 type II collagen {ECO:0000250|UniProtKB:P02458};
DE Flags: Precursor;
GN Name=col2a1 {ECO:0000250|UniProtKB:P02458};
OS Xenopus laevis (African clawed frog).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Amphibia;
OC Batrachia; Anura; Pipoidea; Pipidae; Xenopodinae; Xenopus; Xenopus.
OX NCBI_TaxID=8355;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA], AND DEVELOPMENTAL STAGE.
RX PubMed=1918153; DOI=10.1083/jcb.115.2.565;
RA Su M.W., Suzuki H.R., Bieker J.J., Solursh M., Ramirez F.;
RT "Expression of two nonallelic type II procollagen genes during Xenopus
RT laevis embryogenesis is characterized by stage-specific production of
RT alternatively spliced transcripts.";
RL J. Cell Biol. 115:565-575(1991).
RN [2]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
RC TISSUE=Embryo;
RG NIH - Xenopus Gene Collection (XGC) project;
RL Submitted (MAR-2003) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: Type II collagen is specific for cartilaginous tissues. It is
CC essential for the normal embryonic development of the skeleton, for
CC linear growth and for the ability of cartilage to resist compressive
CC forces (By similarity). {ECO:0000250}.
CC -!- SUBUNIT: Homotrimers of alpha 1(II) chains. {ECO:0000250}.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix {ECO:0000255|PROSITE-ProRule:PRU00793}.
CC -!- DEVELOPMENTAL STAGE: Initially, the transcripts are localized to
CC notochord, somites, and the dorsal region of the lateral plate
CC mesoderm. At later stages of development and parallel to increased mRNA
CC accumulation, collagen expression becomes progressively more confined
CC to chondrogenic regions of the tadpole. {ECO:0000269|PubMed:1918153}.
CC -!- DOMAIN: The C-terminal propeptide, also known as COLFI domain, has
CC crucial roles in tissue growth and repair by controlling both the
CC intracellular assembly of procollagen molecules and the extracellular
CC assembly of collagen fibrils. It binds a calcium ion which is essential
CC for its function (By similarity). {ECO:0000250}.
CC -!- PTM: Contains mostly 4-hydroxyproline. Prolines at the third position
CC of the tripeptide repeating unit (G-X-P) are 4-hydroxylated in some or
CC all of the chains. {ECO:0000250|UniProtKB:P05539}.
CC -!- PTM: Contains 3-hydroxyproline at a few sites. This modification occurs
CC on the first proline residue in the sequence motif Gly-Pro-Hyp, where
CC Hyp is 4-hydroxyproline. {ECO:0000250|UniProtKB:P05539}.
CC -!- PTM: Lysine residues at the third position of the tripeptide repeating
CC unit (G-X-Y) are 5-hydroxylated in some or all of the chains.
CC {ECO:0000250|UniProtKB:P05539}.
CC -!- PTM: O-glycosylated on hydroxylated lysine residues. The O-linked
CC glycan consists of a Glc-Gal disaccharide.
CC {ECO:0000250|UniProtKB:P05539}.
CC -!- SIMILARITY: Belongs to the fibrillar collagen family.
CC {ECO:0000255|PROSITE-ProRule:PRU00793}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; M63595; AAA49678.1; -; mRNA.
DR EMBL; BC048221; AAH48221.1; -; mRNA.
DR EMBL; BC111515; AAI11516.1; -; mRNA.
DR PIR; A40333; A40333.
DR PIR; B40333; B40333.
DR RefSeq; NP_001081258.1; NM_001087789.1.
DR AlphaFoldDB; Q91717; -.
DR SMR; Q91717; -.
DR PRIDE; Q91717; -.
DR DNASU; 397738; -.
DR GeneID; 397738; -.
DR KEGG; xla:397738; -.
DR CTD; 397738; -.
DR Xenbase; XB-GENE-6252613; col2a1.L.
DR OrthoDB; 337699at2759; -.
DR Proteomes; UP000186698; Chromosome 2L.
DR Bgee; 397738; Expressed in internal ear and 7 other tissues.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR001007; VWF_dom.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 8.
DR Pfam; PF00093; VWC; 1.
DR SMART; SM00038; COLFI; 1.
DR SMART; SM00214; VWC; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
PE 2: Evidence at transcript level;
KW Calcium; Collagen; Disulfide bond; Extracellular matrix; Glycoprotein;
KW Hydroxylation; Metal-binding; Reference proteome; Repeat; Secreted; Signal.
FT SIGNAL 1..26
FT /evidence="ECO:0000255"
FT PROPEP 27..183
FT /note="N-terminal propeptide"
FT /evidence="ECO:0000250"
FT /id="PRO_0000286178"
FT CHAIN 184..1243
FT /note="Collagen alpha-1(II) chain"
FT /id="PRO_0000286179"
FT PROPEP 1244..1486
FT /note="C-terminal propeptide"
FT /evidence="ECO:0000250"
FT /id="PRO_0000286180"
FT DOMAIN 36..94
FT /note="VWFC"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00220"
FT DOMAIN 1252..1486
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00793"
FT REGION 100..1241
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 203..1216
FT /note="Triple-helical region"
FT REGION 1217..1243
FT /note="Nonhelical region (C-terminal)"
FT COMPBIAS 138..154
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 353..367
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 434..448
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1202..1219
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT BINDING 1300
FT /ligand="Ca(2+)"
FT /ligand_id="ChEBI:CHEBI:29108"
FT /evidence="ECO:0000250"
FT BINDING 1302
FT /ligand="Ca(2+)"
FT /ligand_id="ChEBI:CHEBI:29108"
FT /evidence="ECO:0000250"
FT BINDING 1303
FT /ligand="Ca(2+)"
FT /ligand_id="ChEBI:CHEBI:29108"
FT /evidence="ECO:0000250"
FT BINDING 1305
FT /ligand="Ca(2+)"
FT /ligand_id="ChEBI:CHEBI:29108"
FT /evidence="ECO:0000250"
FT BINDING 1308
FT /ligand="Ca(2+)"
FT /ligand_id="ChEBI:CHEBI:29108"
FT /evidence="ECO:0000250"
FT SITE 183..184
FT /note="Cleavage; by procollagen N-endopeptidase"
FT /evidence="ECO:0000250"
FT SITE 1243..1244
FT /note="Cleavage; by procollagen C-endopeptidase"
FT /evidence="ECO:0000250"
FT MOD_RES 661
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 670
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 672
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 673
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 676
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 910
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 916
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 922
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1146
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1188
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1189
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1203
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1204
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1207
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1209
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1210
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1213
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1215
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT MOD_RES 1216
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P05539"
FT CARBOHYD 1387
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT DISULFID 1282..1314
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00793"
FT DISULFID 1288
FT /note="Interchain (with C-1305)"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00793"
FT DISULFID 1305
FT /note="Interchain (with C-1288)"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00793"
FT DISULFID 1322..1484
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00793"
FT DISULFID 1392..1437
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00793"
FT CONFLICT 456
FT /note="Q -> E (in Ref. 1; AAA49678)"
FT /evidence="ECO:0000305"
FT CONFLICT 1287
FT /note="L -> I (in Ref. 1; AAA49678)"
FT /evidence="ECO:0000305"
FT CONFLICT 1315
FT /note="D -> N (in Ref. 1; AAA49678)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 1486 AA; 142263 MW; 02C18E5F5807100E CRC64;
MFSFVDSRTL VLFAATQVIL LAVVRCQDEE DVLDTGSCVQ HGQRYSDKDV WKPEPCQICV
CDTGTVLCDD IICEESKDCP NAEIPFGECC PICPTEQSST SSGQGVLKGQ KGEPGDIKDV
LGPRGPPGPQ GPSGEQGSRG ERGDKGEKGA PGPRGRDGEP GTPGNPGPVG PPGPPGLGGN
FAAQMTGGFD EKAGGAQMGV MQGPMGPMGP RGPPGPTGAP GPQGFQGNPG EPGEPGAGGP
MGPRGPPGPS GKPGDDGEAG KPGKSGERGP PGPQGARGFP GTPGLPGVKG HRGYPGLDGA
KGEAGAAGAK GEGGATGEAG SPGPMGPRGL PGERGRPGSS GAAGARGNDG LPGPAGPPGP
VGPAGAPGFP GAPGSKGEAG PTGARGPEGA QGPRGESGTP GSPGPAGASG NPGTDGIPGA
KGSSGGPGIA GAPGFPGPRG PPGPQGATGP LGPKGQTGDP GVAGFKGEQG PKGEIGSAGP
QGAPGPAGEE GKRGARGEPG AAGPNGPPGE RGAPGNRGFP GQDGLAGPKG APGERGVPGL
GGPKGGNGDP GRPGEPGLPG ARGLTGRPGD AGPQGKVGPS GASGEDGRPG PPGPQGARGQ
PGVMGFPGPK GANGEPGKAG EKGLVGAPGL RGLPGKDGET GSQGPNGPAG PAGERGEQGP
PGPSGFQGLP GPPGSPGEGG KPGDQGVPGE AGAPGLVGPR GERGFPGERG SSGPQGLQGP
RGLPGTPGTD GPKGASGPSG PNGAQGPPGL QGMPGERGAA GISGPKGDRG DTGEKGPEGA
SGKDGSRGLT GPIGPPGPAG PNGEKGESGP SGPPGIVGAR GAPGDRGENG PPGPAGFAGP
PGADGQSGLK GDQGESGQKG DAGAPGPQGP SGAPGPQGPT GVFGPKGARG AQGPAGATGF
PGAAGRVGTP GPNGNPGPPG PPGSAGKEGP KGVRGDAGPP GRAGDPGLQG AAGAPGEKGE
PGEDGPSGPD GPPGPQGLSG QRGIVGLPGQ RGERGFPGLP GPSGEPGKQG GPGSSGDRGP
PGPVGPPGLT GPSGEPGREG NPGSDGPPGR DGATGIKGDR GETGPLGAPG APGAPGAPGS
VGPTGKQGDR GESGPQGPLG PSGPAGARGL AGPQGPRGDK GEAGEAGERG QKGHRGFTGL
QGLPGPPGSA GDQGATGPAG PAGPRGPPGP VGPSGKDGSN GISGPIGPPG PRGRSGETGP
SGPPGQPGPP GPPGPPGPGI DMSAFAGLSQ PEKGPDPMRY MRADQASNSL PVDVEATLKS
LNNQIENIRS PDGTKKNPAR TCRDLKLCHP EWKSGDYWID PNQGCTVDAI KVFCDMETGE
TCVYPNPSKI PKKNWWSAKG KEKKHIWFGE TINGGFQFSY GDDSSAPNTA NIQMTFLRLL
STDASQNITY HCKNSIAFMD EASGNLKKAV LLQGSNDVEI RAEGNSRFTY NALEDGCKKH
TGKWSKTVIE YRTQKTSRLP IVDIAPMDIG GADQEFGVDI GPVCFL