ID CO1A2_NOTSH Reviewed; 1008 AA.
AC C0HLJ4;
DT 13-NOV-2019, integrated into UniProtKB/Swiss-Prot.
DT 13-NOV-2019, sequence version 1.
DT 25-MAY-2022, entry version 5.
DE RecName: Full=Collagen alpha-2(I) chain {ECO:0000303|PubMed:31171860};
DE AltName: Full=Alpha-2 type I collagen {ECO:0000250|UniProtKB:P08123};
DE Flags: Fragments;
OS Nothrotheriops shastensis (Shasta ground sloth).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Xenarthra; Pilosa; Folivora; Megatheriidae; Nothrotheriops.
OX NCBI_TaxID=136416 {ECO:0000303|PubMed:31171860};
RN [1] {ECO:0000305}
RP PROTEIN SEQUENCE, TISSUE SPECIFICITY, AND IDENTIFICATION BY MASS
RP SPECTROMETRY.
RC TISSUE=Bone {ECO:0000303|PubMed:31171860};
RX PubMed=31171860; DOI=10.1038/s41559-019-0909-z;
RA Presslee S., Slater G.J., Pujos F., Forasiepi A.M., Fischer R., Molloy K.,
RA Mackie M., Olsen J.V., Kramarz A., Taglioretti M., Scaglia F., Lezcano M.,
RA Lanata J.L., Southon J., Feranec R., Bloch J., Hajduk A., Martin F.M.,
RA Salas Gismondi R., Reguero M., de Muizon C., Greenwood A., Chait B.T.,
RA Penkman K., Collins M., MacPhee R.D.E.;
RT "Palaeoproteomics resolves sloth relationships.";
RL Nat. Ecol. Evol. 3:1121-1130(2019).
CC -!- FUNCTION: Type I collagen is a member of group I collagen (fibrillar
CC forming collagen). {ECO:0000305}.
CC -!- SUBUNIT: Trimers of one alpha 2(I) and two alpha 1(I) chains.
CC {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Secreted. Secreted, extracellular space.
CC Secreted, extracellular space, extracellular matrix {ECO:0000305}.
CC -!- TISSUE SPECIFICITY: Expressed in bones. {ECO:0000269|PubMed:31171860}.
CC -!- PTM: Prolines at the third position of the tripeptide repeating unit
CC (G-X-Y) are hydroxylated in some or all of the chains.
CC {ECO:0000250|UniProtKB:P08123}.
CC -!- MISCELLANEOUS: These protein fragments were extracted from an ancient
CC femur bone, around 28580 years old, collected at Rampart Cave in
CC Arizona, USA. {ECO:0000269|PubMed:31171860}.
CC -!- SIMILARITY: Belongs to the fibrillar collagen family. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR AlphaFoldDB; C0HLJ4; -.
DR GO; GO:0005615; C:extracellular space; IEA:UniProtKB-SubCell.
DR InterPro; IPR008160; Collagen.
DR Pfam; PF01391; Collagen; 2.
PE 1: Evidence at protein level;
KW Direct protein sequencing; Extinct organism protein; Extracellular matrix;
KW Glycoprotein; Hydroxylation; Secreted.
FT CHAIN 1..1008
FT /note="Collagen alpha-2(I) chain"
FT /id="PRO_0000448471"
FT REGION 1..999
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 187..201
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOD_RES 9
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 12
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 45
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 51
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 116
FT /note="5-hydroxylysine; alternate"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 380
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT MOD_RES 383
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT CARBOHYD 116
FT /note="O-linked (Gal...) hydroxylysine; alternate"
FT /evidence="ECO:0000250|UniProtKB:P08123"
FT UNSURE 8
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 31
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 38
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 112
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 124
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 127
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 157
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 206
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 226
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 244
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 253
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 262
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 283
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 337
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 346
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 385
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 391
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 409
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 453
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 476
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 500
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 560
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 579
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 581
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 737
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 785
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 786
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 792
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 794
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 803
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 816
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 853
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 905
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 931
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 934
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 940
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 943
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT UNSURE 946
FT /note="L or I"
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 27..28
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 56..57
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 82..83
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 445..446
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 458..459
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 818..819
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 839..840
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 954..955
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_CONS 992..993
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_TER 1
FT /evidence="ECO:0000303|PubMed:31171860"
FT NON_TER 1008
FT /evidence="ECO:0000303|PubMed:31171860"
SQ SEQUENCE 1008 AA; 90993 MW; DA90E6C6C61B9BDD CRC64;
GGFDFSFLPQ PPQEKGHDGG RYYRAKQGVG LGPGPMGLMG PRGPPGASGA PGPQGFPAGE
PGEPGQTGPA GARGPAGPPG KADGHPGKPG RPGERGVVGP QGARGFPGTP GLPGFKGIRG
HNGLDGLKGQ PGAPGVKGEP GAPGENGTPG QTGARGLPGE RGRVGAPGPA GARGSDGSVG
PVGPAGPIGS AGPPGFPGAP GPKGELGPVG NTGPSGPAGP RGEQGLPGVS GPVGPPGNPG
ANGLTGAKGA AGLPGVAGAP GLPGPRGIPG PVGASGATGA RGLVGEPGPA GSKGESGNKG
EPGSAGPQGP PGSSGEEGKR GPNGESGSTG PTGPPGLRGG PGSRGLPGAD GRAGVIGPAG
ARGASGPAGV RGPSGDTGRP GEPGLMGARG LPGSPGNVGP AGKEGPAGLP GIDGRPGPIG
PAGARGEAGN IGFPGPKGPA GDPGKGEKGH AGLAGNRGQG GKGEQGPAGP PGFQGLPGPA
GTTGEAGKPG ERGIPGEFGL PGPAGPRGER GPPGESGAVG PSGAIGSRGP SGPPGPDGNK
GEPGVVGAPG TAGPAGSGGL PGERGAAGMP GGKGEKGELG LRGEVGTTGR DGARGAPGAV
GAPGPAGATG DRGEAGAAGP AGPAGPRGSP GERGEVGPAG PNGFAGPAGA AGQPGAKGER
GTKGPKGENG IVGPTGPVGS AGPAGPNGPA GPAGSRGDGG PPGATGFPGA AGRTGPPGPS
GITGPPGPPG AAGKEGLRGP RGDQGPVGRT GETGAGGPPG FTGEKGPSGE PGTAGPPGTA
GPQGLLGAPG ILGLPGSRGE RGLPGVAGAV GEPGPLGIGP PGARGPPGAV GSPGVNGAPG
NPGSDGPPGR DGLPGHKGER GYAGNAGPVG AAGAPGPHGT VGPAGKHGNR GEPGPVGSVG
PVGALGPRGP SGPQGIRGDK GEPGDKGPRG LPGLKGHNGL QGLPGLAGQH GDQGPGPVGP
AGPRGPAGPS GPAGKDGRTG HPGAVGPAGI RGSGGGYDFG YEGDFYRA