COLL1_MIMIV
ID COLL1_MIMIV Reviewed; 945 AA.
AC Q5UPE4;
DT 13-SEP-2005, integrated into UniProtKB/Swiss-Prot.
DT 07-DEC-2004, sequence version 1.
DT 03-AUG-2022, entry version 63.
DE RecName: Full=Collagen-like protein 1;
GN OrderedLocusNames=MIMI_L71;
OS Acanthamoeba polyphaga mimivirus (APMV).
OC Viruses; Varidnaviria; Bamfordvirae; Nucleocytoviricota; Megaviricetes;
OC Imitervirales; Mimiviridae; Mimivirus.
OX NCBI_TaxID=212035;
OH NCBI_TaxID=5757; Acanthamoeba polyphaga (Amoeba).
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Rowbotham-Bradford;
RX PubMed=15486256; DOI=10.1126/science.1101485;
RA Raoult D., Audic S., Robert C., Abergel C., Renesto P., Ogata H.,
RA La Scola B., Suzan M., Claverie J.-M.;
RT "The 1.2-megabase genome sequence of Mimivirus.";
RL Science 306:1344-1350(2004).
CC -!- FUNCTION: May participate in the formation of a layer of cross-linked
CC glycosylated fibrils at the viral surface thus giving it a hairy-like
CC appearance. {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Virion.
CC -!- PTM: May be hydroxylated on lysine by the viral-encoded procollagen-
CC lysine,2-oxoglutarate 5-dioxygenase. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AY653733; AAV50346.1; -; Genomic_DNA.
DR RefSeq; YP_003986560.1; NC_014649.1.
DR PRIDE; Q5UPE4; -.
DR GeneID; 9924664; -.
DR KEGG; vg:9924664; -.
DR Proteomes; UP000001134; Genome.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR021210; Exosporium_BclB.
DR Pfam; PF01391; Collagen; 8.
DR TIGRFAMs; TIGR03721; exospore_TM; 1.
PE 4: Predicted;
KW Collagen; Glycoprotein; Hydroxylation; Reference proteome; Repeat; Virion.
FT CHAIN 1..945
FT /note="Collagen-like protein 1"
FT /id="PRO_0000059416"
FT DOMAIN 83..142
FT /note="Collagen-like 1"
FT DOMAIN 146..205
FT /note="Collagen-like 2"
FT DOMAIN 257..376
FT /note="Collagen-like 3"
FT DOMAIN 383..442
FT /note="Collagen-like 4"
FT DOMAIN 488..547
FT /note="Collagen-like 5"
FT DOMAIN 554..613
FT /note="Collagen-like 6"
FT DOMAIN 635..694
FT /note="Collagen-like 7"
FT REGION 80..226
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 257..441
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 488..712
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 733..768
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 117..209
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 257..440
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 488..686
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 737..757
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 211
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 442
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 716
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
SQ SEQUENCE 945 AA; 93884 MW; E6432D6EDB7A7BC8 CRC64;
MSRITCPITD CKCKCNKNNC VYCVMGRQGL PGPKGSSGNS IYVGTGVPSP FLGNNGDLYI
DSSTGLLYAK VNGVWVPQGS LKGDPGASGS KGEKGDKGSS GEAGLKGEQG TKGEQGDQGE
QGDKGDKGDK GDVGAKGDQG DKGDQGDVGA KGDQGDKGDQ GDVGAKGDQG DKGDKGDQGD
KGDVGDPGVK GDKGDTGDKG DKGDKGDKGQ NGSEILFGLG IPSPDLGEDG DVYIDTLTGN
VYQKIGGVWV LETNIKGEKG DQGDKGDTGS KGDQGDKGDQ GDKGDQGDKG DVGDKGNKGD
TGSKGDVGDK GDVGDKGDKG DTGDKGDKGD TGDKGDKGDV GDKGDKGDVG DKGDVGDKGD
VGDKGDKGDT GDKGDKGDIG DKGDKGDIGD KGDKGDIGDK GDKGDVGDKG DKGDKGDIGD
KGDKGDIGDK GDKGDKGDKG ENGSGILFGL GIPSPDLGED GDIYIDTLTG NVYQKIGGVW
VLETSIKGEK GDKGDTGDKG DTGDKGDTGD KGDTGDKGDT GDKGDVGDKG DVGDKGDVGD
KGDVGDKGDK GDIGDKGDKG DLGDKGDKGD VGDKGDVGDK GDKGDIGDKG DKGDLGDKGD
KGDVGDKGDK GDVGDKGDKG DIGDKGDKGD VGDKGDKGDI GDKGDKGDKG DVGSKGDKGD
KGDVGDKGDK GDVGSKGDKG DKGDKGDVGP VGASILFGAG VPSPTTGENG DSYIDNSTGV
FYLKINDVWV PQTNIKGDKG DKGDKGDKGD KGDTGDVGLK GDTGTPGSGP IIPYSSGLTP
VALAVVAVAG GGIADTGASY DFGVSSPSVT LVGVNLDFTG PVQGLLPNMA WSAPRDTVIT
SLATAFQVSV AISAVLEPIF LRTQVYRELA ANPGVFEPLA GAIVEFDVAS SALISVGTVF
RGIVTGLSIP VNAGDRLIVF ANTRTTSLIS VGTVTGFISS GLALA
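
The SQ line above declares the sequence length (945 AA), and the residue lines that follow can be checked against it. The following is a minimal sketch of such a check, not part of the UniProt record or any official tooling. The function name parse_sq_block and the TOY_RECORD lines are hypothetical: the toy record reuses only the first 15 residues of this entry, and its MW and CRC64 fields are placeholders, not real values.

```python
import re


def parse_sq_block(lines):
    """Return (declared_length, sequence) from an SQ header and its residue lines.

    Assumes the flat-file layout shown above: one line starting with "SQ"
    that contains "SEQUENCE <n> AA", followed by residue lines whose
    10-residue groups are separated by spaces. Reading stops at the "//"
    terminator if one is present.
    """
    declared = None
    residues = []
    in_sequence = False
    for line in lines:
        if line.startswith("SQ"):
            match = re.search(r"SEQUENCE\s+(\d+)\s+AA", line)
            if match is None:
                raise ValueError("malformed SQ line: " + line)
            declared = int(match.group(1))
            in_sequence = True
        elif in_sequence:
            if line.startswith("//"):
                break
            # Drop the spaces between residue groups and any indentation.
            residues.append("".join(line.split()))
    if declared is None:
        raise ValueError("no SQ line found")
    return declared, "".join(residues)


# Hypothetical toy record: placeholder MW and CRC64, first 15 residues only.
TOY_RECORD = [
    "SQ SEQUENCE 15 AA; 0 MW; 0000000000000000 CRC64;",
    "MSRITCPITD CKCKC",
]

declared_length, sequence = parse_sq_block(TOY_RECORD)
lengths_agree = declared_length == len(sequence)
```

Passing the lines of the full record instead of TOY_RECORD applies the same comparison to the 945 AA declared above. The check covers length only; it does not recompute the molecular weight or the CRC64 checksum.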