ID CO1A2_ACRYE Reviewed; 949 AA.
AC C0HLH2;
DT 13-NOV-2019, integrated into UniProtKB/Swiss-Prot.
DT 13-NOV-2019, sequence version 1.
DT 25-MAY-2022, entry version 5.
DE RecName: Full=Collagen alpha-2(I) chain {ECO:0000303|PubMed:31171860};
DE AltName: Full=Alpha-2 type I collagen {ECO:0000250|UniProtKB:P08123};
DE Flags: Fragments;
OS Acratocnus ye (Hispaniolan ground sloth).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Xenarthra; Pilosa; Folivora; Megalonychidae; Acratocnus.
OX NCBI_TaxID=2546656 {ECO:0000303|PubMed:31171860};
RN [1] {ECO:0000305}
RP PROTEIN SEQUENCE, TISSUE SPECIFICITY, AND IDENTIFICATION BY MASS
RP SPECTROMETRY.
RC TISSUE=Bone {ECO:0000303|PubMed:31171860};
RX PubMed=31171860; DOI=10.1038/s41559-019-0909-z;
RA Presslee S., Slater G.J., Pujos F., Forasiepi A.M., Fischer R., Molloy K.,
RA Mackie M., Olsen J.V., Kramarz A., Taglioretti M., Scaglia F., Lezcano M.,
RA Lanata J.L., Southon J., Feranec R., Bloch J., Hajduk A., Martin F.M.,
RA Salas Gismondi R., Reguero M., de Muizon C., Greenwood A., Chait B.T.,
RA Penkman K., Collins M., MacPhee R.D.E.;
RT "Palaeoproteomics resolves sloth relationships.";
RL Nat. Ecol. Evol. 3:1121-1130(2019).
CC -!- FUNCTION: Type I collagen is a member of group I collagen (fibrillar
CC forming collagen). {ECO:0000305}.
CC -!- SUBUNIT: Trimers of one alpha 2(I) and two alpha 1(I) chains.
CC {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Secreted. Secreted, extracellular space.
CC Secreted, extracellular space, extracellular matrix {ECO:0000305}.
CC -!- TISSUE SPECIFICITY: Expressed in bones. {ECO:0000269|PubMed:31171860}.
CC -!- PTM: Prolines at the third position of the tripeptide repeating unit
CC (G-X-Y) are hydroxylated in some or all of the chains.
CC {ECO:0000250|UniProtKB:P08123}.
CC -!- MISCELLANEOUS: These protein fragments were extracted from an ancient
CC mandible bone collected in Haiti. {ECO:0000269|PubMed:31171860}.
CC -!- SIMILARITY: Belongs to the fibrillar collagen family. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR AlphaFoldDB; C0HLH2; -.
DR GO; GO:0005615; C:extracellular space; IEA:UniProtKB-SubCell.
DR InterPro; IPR008160; Collagen.
DR Pfam; PF01391; Collagen; 5.
PE 1: Evidence at protein level;
KW Direct protein sequencing; Extinct organism protein; Extracellular matrix;
KW Glycoprotein; Hydroxylation; Secreted.
FT CHAIN 1..949
FT /note="Collagen alpha-2(I) chain"
FT /id="PRO_0000448453"
FT REGION 1..949
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 156..170
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOD_RES 10
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 13
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 34
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 40
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 95
FT /note="5-hydroxylysine; alternate"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 348
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 351
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT CARBOHYD 95
FT /note="O-linked (Gal...) hydroxylysine; alternate"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT UNSURE 9
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 20
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 27
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 91
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 103
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 106
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 127
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 175
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 195
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 213
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 222
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 231
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 252
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 305
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 314
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 326
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 353
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 359
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 377
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 433
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 454
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 477
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 537
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 558
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 714
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 761
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 762
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 768
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 770
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 779
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 792
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 823
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 849
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 852
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 858
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 861
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 864
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 17..18
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 74..75
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 107..108
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 141..142
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 294..295
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 413..414
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 471..472
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 726..727
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 802..803
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 891..892
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_TER 1
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_TER 949
FT /evidence="ECO:0000303|PubMed:31171860"
SQ SEQUENCE 949 AA; 85141 MW; 7917E221ED88803B CRC64;
SGGFDFSFLP QPPQEKAVGL GPGPMGLMGP RGPPGASGAP GPQGFQGPAG EPGEPGQTGP
AGARGPAGPP GKAGGVVGPQ GARGFPGTPG LPGFKGIRGH NGLDGLKGEP GAPGENGTPG
QTGARGLPGE RGRVGAPGPA GRGSDGSVGP VGPAGPIGSA GPPGFPGAPG PKGELGPVGN
TGPAGPAGPR GEQGLPGVSG PVGPPGNPGA NGLTGAKGAA GLPGVAGAPG LPGPRGIPGP
VGASGATGAR GLVGEPGPAG SKGESGGKGE PGSAGPQGPP GSSGEEGKRG PNGEGSTGPT
GPPGLRGGPG SRGLPGADGR AGVIGLAGAR GASGPAGVRG PSGDTGRPGE PGLMGARGLP
GSPGNVGPAG KEGPVGLPGI DGRPGPIGPA GARGEAGNIG FPGPKGPAGD PGKGNRGAPG
PDGNNGAQGP PGLQGVQGGK GEQGPAGPPG FQGLPGPAGT TGEAGKPGER GPGEFGLPGP
AGPRGERGPP GESGAVGPSG AIGSRGPSGP PGPDGNKGEP GVVGAPGTAG PAGSGGLPGE
RGAAGIPGGK GEKGETGLRG EVGTTGRDGA RGAPGAVGAP GPAGATGDRG EAGAAGPAGP
AGPRGSPGER GEVGPAGPNG FAGPAGAAGQ PGAKGERGTK GPKGENGIVG PTGPVGSAGP
AGPNGPAGPA GSRGDGGPPG VTGFPGAAGR TGPPGPSGIT GPPGPPGAAG KEGLRGPRGD
QGPVGRGETG AGGPPGFTGE KGPSGEPGTA GPPGTAGPQG LLGAPGILGL PGSRGERGLP
GVAGAVGEPG PLGISGPPGA RGGKHGNRGE PGPVGSVGPV GALGPRGPSG PQGIRGDKGE
PGEKGPRGLP GLKGHNGLQG LPGLAGQHGD QGSPGPVGPA GPRGPAGPSG PPGKDGRTGH
PGAVGPAGIR GSQGSQGPSG PAGPPGPPGP PGASGGGYDF GYEGDFYRA
//