CO1A2_NEODO
ID CO1A2_NEODO Reviewed; 979 AA.
AC C0HLH6;
DT 13-NOV-2019, integrated into UniProtKB/Swiss-Prot.
DT 13-NOV-2019, sequence version 1.
DT 25-MAY-2022, entry version 5.
DE RecName: Full=Collagen alpha-2(I) chain {ECO:0000303|PubMed:31171860};
DE AltName: Full=Alpha-2 type I collagen {ECO:0000250|UniProtKB:P08123};
DE Flags: Fragments;
OS Neocnus dousman (Slow ground sloth).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Xenarthra; Pilosa; Folivora; Megalonychidae; Neocnus.
OX NCBI_TaxID=2546657 {ECO:0000303|PubMed:31171860};
RN [1] {ECO:0000305}
RP PROTEIN SEQUENCE, TISSUE SPECIFICITY, AND IDENTIFICATION BY MASS
RP SPECTROMETRY.
RC TISSUE=Bone {ECO:0000303|PubMed:31171860};
RX PubMed=31171860; DOI=10.1038/s41559-019-0909-z;
RA Presslee S., Slater G.J., Pujos F., Forasiepi A.M., Fischer R., Molloy K.,
RA Mackie M., Olsen J.V., Kramarz A., Taglioretti M., Scaglia F., Lezcano M.,
RA Lanata J.L., Southon J., Feranec R., Bloch J., Hajduk A., Martin F.M.,
RA Salas Gismondi R., Reguero M., de Muizon C., Greenwood A., Chait B.T.,
RA Penkman K., Collins M., MacPhee R.D.E.;
RT "Palaeoproteomics resolves sloth relationships.";
RL Nat. Ecol. Evol. 3:1121-1130(2019).
CC -!- FUNCTION: Type I collagen is a member of group I collagen (fibril-forming
CC collagen). {ECO:0000305}.
CC -!- SUBUNIT: Trimers of one alpha 2(I) and two alpha 1(I) chains.
CC {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Secreted. Secreted, extracellular space.
CC Secreted, extracellular space, extracellular matrix {ECO:0000305}.
CC -!- TISSUE SPECIFICITY: Expressed in bones. {ECO:0000269|PubMed:31171860}.
CC -!- PTM: Prolines at the third position of the tripeptide repeating unit
CC (G-X-Y) are hydroxylated in some or all of the chains.
CC {ECO:0000250|UniProtKB:P08123}.
CC -!- MISCELLANEOUS: These protein fragments were extracted from an ancient
CC tibia bone collected in Haiti. {ECO:0000269|PubMed:31171860}.
CC -!- SIMILARITY: Belongs to the fibrillar collagen family. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR AlphaFoldDB; C0HLH6; -.
DR GO; GO:0005615; C:extracellular space; IEA:UniProtKB-SubCell.
DR InterPro; IPR008160; Collagen.
DR Pfam; PF01391; Collagen; 12.
PE 1: Evidence at protein level;
KW Direct protein sequencing; Extinct organism protein; Extracellular matrix;
KW Glycoprotein; Hydroxylation; Secreted.
FT CHAIN 1..979
FT /note="Collagen alpha-2(I) chain"
FT /id="PRO_0000448449"
FT REGION 1..979
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 151..165
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOD_RES 10
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 13
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 35
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 41
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 88
FT /note="5-hydroxylysine; alternate"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 341
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 344
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT CARBOHYD 88
FT /note="O-linked (Gal...) hydroxylysine; alternate"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT UNSURE 9
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 21
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 28
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 84
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 92
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 95
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 121
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 189
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 207
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 215
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 224
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 245
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 299
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 308
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 346
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 352
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 370
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 411
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 432
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 453
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 477
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 557
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 707
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 755
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 756
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 762
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 764
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 773
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 786
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 802
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 854
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 880
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 883
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 889
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 892
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 895
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 17..18
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 46..47
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 62..63
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 90..91
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 115..116
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 169..170
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 209..210
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 322..323
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 384..385
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 405..406
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 536..537
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 582..583
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 788..789
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 799..800
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 952..953
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_TER 1
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_TER 979
FT /evidence="ECO:0000303|PubMed:31171860"
SQ SEQUENCE 979 AA; 87774 MW; 1CD9D2B62FD949E1 CRC64;
SGGFDFSFLP QPPQEKAGVG LGPGPMGLMG PRGPPGASGA PGPQGFGARG PAGPPGKAGE
DGRPGERGVV GPQGARGFPG TPGLPGFKGI GLDGLKGQPG APGVKGEPGA PGENGTGARG
LPGERGRVGA PGPAGARGSD GSVGPVGPAG PIGSAGPPGF PGAPGPKGEG PVGNTGPSGP
AGPRGEQGLP GVSGPVGPPG NPGANGLTGK GAAGLPGVAG APGLPGPRGI PGPVGASGAT
GARGLVGEPG PAGSKGESGG KGEPGSAGPQ GPPGSSGEEG KRGPNGEAGS TGPTGPPGLR
GGPGSRGLPG ADGRAGVIGP AGRGASGPAG VRGPSGDTGR PGEPGLMGAR GLPGSPGNVG
PAGKEGPVGL PGIDGRPGPI GPAGRGEAGN IGFPGPKGPA GDPGKKGHAG LAGNRGAPGP
DGNNGAQGPP GLQGVQGGKG EQGPAGPPGF QGLPGPAGTT GEVGKPGERG IPGEFGLPGP
AGPRGERGPP GESGAVGPSG AIGSRGPSGP PGPDGNKGEP GVVGAPGTAG PAGSGGPGER
GAAGIPGGKG EKGETGLRGE VGTTGRDGAR GAPGAVGAPG PAGEAGAAGP AGPAGPRGSP
GERGEVGPAG PNGFAGPAGA AGQPGAKGER GTKGPKGENG IVGPTGPVGS AGPAGPNGPA
GPAGSRGDGG PPGVTGFPGA AGRTGPPGPS GITGPPGPPG AAGKEGLRGP RGDQGPVGRT
GETGAGGPPG FTGEKGPSGE PGTAGPPGTA GPQGLLGAPG ILGLPGSRGE RGLPGVAGAV
GEPGPLGIGP PGARGPSGGD GLPGHKGERG YAGNAGPVGA AGAPGPHGSV GPAGKHGNRG
EPGPVGSVGP VGALGPRGPS GPQGIRGDKG EPGDKGPRGL PGLKGHNGLQ GLPGLAGQHG
DQGSPGPVGP AGPRGPAGPS GPPGKDGRTG HPGAVGPAGI RGSQGSQGPS GPGPPGPPGP
PGASGGGYDF GYEGDFYRA