ID CO1A2_ACRSX Reviewed; 925 AA.
AC C0HLH4;
DT 13-NOV-2019, integrated into UniProtKB/Swiss-Prot.
DT 13-NOV-2019, sequence version 1.
DT 25-MAY-2022, entry version 5.
DE RecName: Full=Collagen alpha-2(I) chain {ECO:0000303|PubMed:31171860};
DE AltName: Full=Alpha-2 type I collagen {ECO:0000250|UniProtKB:P08123};
DE Flags: Fragments;
OS Acratocnus sp. (strain SLP-2019) (Ground sloth).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Xenarthra; Pilosa; Folivora; Megalonychidae; Acratocnus;
OC unclassified Acratocnus.
OX NCBI_TaxID=2546662 {ECO:0000303|PubMed:31171860};
RN [1] {ECO:0000305}
RP PROTEIN SEQUENCE, TISSUE SPECIFICITY, AND IDENTIFICATION BY MASS
RP SPECTROMETRY.
RC TISSUE=Bone {ECO:0000303|PubMed:31171860};
RX PubMed=31171860; DOI=10.1038/s41559-019-0909-z;
RA Presslee S., Slater G.J., Pujos F., Forasiepi A.M., Fischer R., Molloy K.,
RA Mackie M., Olsen J.V., Kramarz A., Taglioretti M., Scaglia F., Lezcano M.,
RA Lanata J.L., Southon J., Feranec R., Bloch J., Hajduk A., Martin F.M.,
RA Salas Gismondi R., Reguero M., de Muizon C., Greenwood A., Chait B.T.,
RA Penkman K., Collins M., MacPhee R.D.E.;
RT "Palaeoproteomics resolves sloth relationships.";
RL Nat. Ecol. Evol. 3:1121-1130(2019).
CC -!- FUNCTION: Type I collagen is a member of group I collagen
CC (fibril-forming collagen). {ECO:0000305}.
CC -!- SUBUNIT: Trimers of one alpha 2(I) and two alpha 1(I) chains.
CC {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix {ECO:0000305}.
CC -!- TISSUE SPECIFICITY: Expressed in bones. {ECO:0000269|PubMed:31171860}.
CC -!- PTM: Prolines at the third position of the tripeptide repeating unit
CC (G-X-Y) are hydroxylated in some or all of the chains.
CC {ECO:0000250|UniProtKB:P08123}.
CC -!- MISCELLANEOUS: These protein fragments were extracted from an ancient
CC mandible bone collected in Haiti. {ECO:0000269|PubMed:31171860}.
CC -!- SIMILARITY: Belongs to the fibrillar collagen family. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR AlphaFoldDB; C0HLH4; -.
DR GO; GO:0005615; C:extracellular space; IEA:UniProtKB-SubCell.
DR InterPro; IPR008160; Collagen.
DR Pfam; PF01391; Collagen; 6.
PE 1: Evidence at protein level;
KW Direct protein sequencing; Extinct organism protein; Extracellular matrix;
KW Glycoprotein; Hydroxylation; Secreted.
FT CHAIN 1..925
FT /note="Collagen alpha-2(I) chain"
FT /id="PRO_0000448451"
FT REGION 1..925
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 165..179
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOD_RES 10
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 13
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 42
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 48
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 103
FT /note="5-hydroxylysine; alternate"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 317
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 332
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 335
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT CARBOHYD 103
FT /note="O-linked (Gal...) hydroxylysine; alternate"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT UNSURE 9
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 28
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 35
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 99
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 111
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 114
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 144
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 184
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 204
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 222
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 230
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 239
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 260
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 307
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 316
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 337
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 343
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 360
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 400
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 421
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 442
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 466
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 526
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 547
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 743
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 744
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 750
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 752
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 761
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 774
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 804
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 830
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 834
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 837
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 840
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 24..25
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 82..83
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 150..151
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 224..225
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 297..298
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 322..323
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 358..359
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 395..396
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 700..701
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 708..709
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 776..777
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 783..784
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 832..833
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_TER 1
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_TER 925
FT /evidence="ECO:0000303|PubMed:31171860"
SQ SEQUENCE 925 AA; 83208 MW; D3A92180D3A42B83 CRC64;
SGGFDFSFLP QPPQEKAHDG GRYYGVGLGP GPMGLMGPRG PPGASGAPGP QGFQGPAGEP
GEPGQTGPAG ARGPAGPPGK AGGVVGPQGA RGFPGTPGLP GFKGIRGHNG LDGLKGQPGA
QGVKGEPGAP GENGTPGQTG ARGLPGERGR RGSDGSVGPV GPAGPIGSAG PPGFPGAPGP
KGELGPVGNT GPAGPAGPRG EQGLPGVSGP VGPPGNPGAN GLTGKGAAGL PGVAGAPGLP
GPRGIPGPVG ASGATGARGL VGEPGPAGSK GESGGKGEPG SAGPQGPPGS SGEEGKRSTG
PTGPPGLRGG PGSRGLPGAD GRRGPSGDTG RPGEPGLMGA RGLPGSPGNV GPAGKEGPGL
PGIDGRPGPI GPAGARGEAG NIGFPGPKGP AGDPGGHAGL AGNRGAPGPD GNNGAQGPPG
LQGVQGGKGE QGPAGPPGFQ GLPGPAGTTG EAGKPGERGI PGEFGLPGPA GPRGERGPPG
ESGAVGPSGA IGSRGPSGPP GPDGNKGEPG VVGAPGTAGP AGSGGLPGER GAAGIPGGKG
EKGETGLRGE VGTTGRDGAR GAPGAVGAPG PAGATGDRGE AGAAGPAGPA GPRGSPGERG
EVGPAGPNGF AGPAGAAGQP GAKGERGTKG PKGENGIVGP TGPVGSAGPA GPNGPAGPAG
SRGDGGPPGV TGFPGAAGRT GPPGPSGITG PPGPPGAAGK GDQGPVGRGE TGAGGPPGFT
GEKGPSGEPG TAGPPGTAGP QGLLGAPGIL GLPGSRGERG LPGVAGAVGE PGPLGIGPPG
ARGGKHGNRG EPGPVGSVGP VGALGPRGPS GPQGIRGDKG EPGEKGPRGL PGGLQGLPGL
AGQHGDQGSP GPVGPAGPRG PAGPSGPPGK DGRTGHPGAV GPAGIRGSQG SQGPSGPAGP
PGPPGPPGAS GGGYDFGYEG DFYRA
//