COLL2_MIMIV
ID COLL2_MIMIV Reviewed; 1595 AA.
AC Q5UQ13;
DT 13-SEP-2005, integrated into UniProtKB/Swiss-Prot.
DT 07-DEC-2004, sequence version 1.
DT 23-FEB-2022, entry version 62.
DE RecName: Full=Collagen-like protein 2;
GN OrderedLocusNames=MIMI_R196;
OS Acanthamoeba polyphaga mimivirus (APMV).
OC Viruses; Varidnaviria; Bamfordvirae; Nucleocytoviricota; Megaviricetes;
OC Imitervirales; Mimiviridae; Mimivirus.
OX NCBI_TaxID=212035;
OH NCBI_TaxID=5757; Acanthamoeba polyphaga (Amoeba).
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Rowbotham-Bradford;
RX PubMed=15486256; DOI=10.1126/science.1101485;
RA Raoult D., Audic S., Robert C., Abergel C., Renesto P., Ogata H.,
RA La Scola B., Susan M., Claverie J.-M.;
RT "The 1.2-megabase genome sequence of Mimivirus.";
RL Science 306:1344-1350(2004).
CC -!- FUNCTION: May participate in the formation of a layer of cross-linked
CC glycosylated fibrils at the viral surface, thus giving it a hairy
CC appearance. {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Virion.
CC -!- PTM: May be hydroxylated on lysine by the virus-encoded procollagen-
CC lysine,2-oxoglutarate 5-dioxygenase. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AY653733; AAV50469.1; -; Genomic_DNA.
DR RefSeq; YP_003986692.1; NC_014649.1.
DR GeneID; 9924803; -.
DR KEGG; vg:9924803; -.
DR Proteomes; UP000001134; Genome.
DR InterPro; IPR008160; Collagen.
DR Pfam; PF01391; Collagen; 6.
PE 4: Predicted;
KW Collagen; Glycoprotein; Hydroxylation; Reference proteome; Repeat; Virion.
FT CHAIN 1..1595
FT /note="Collagen-like protein 2"
FT /id="PRO_0000059417"
FT DOMAIN 97..155
FT /note="Collagen-like 1"
FT DOMAIN 175..233
FT /note="Collagen-like 2"
FT DOMAIN 236..295
FT /note="Collagen-like 3"
FT DOMAIN 299..358
FT /note="Collagen-like 4"
FT DOMAIN 380..559
FT /note="Collagen-like 5"
FT DOMAIN 608..907
FT /note="Collagen-like 6"
FT DOMAIN 920..1039
FT /note="Collagen-like 7"
FT DOMAIN 1043..1102
FT /note="Collagen-like 8"
FT DOMAIN 1128..1307
FT /note="Collagen-like 9"
FT REGION 181..577
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 604..1326
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1538..1585
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 213..266
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 284..499
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 509..557
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 606..704
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 720..884
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 899..1097
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1105..1253
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1263..1299
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1548..1562
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 87
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 134
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 274
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 280
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 286
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 373
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 382
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 400
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 409
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1345
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1420
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1545
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
SQ SEQUENCE 1595 AA; 158703 MW; 03904D883333944F CRC64;
MTYIYNIYMS YSRGNLDRYI DQRLNRFYDQ LISQRYAYSI PGIPGIKGDK GLTGSRIFLR
NGIPQNELGT NGDLYINLLT DDLYFKNDSF WSLVGNLRGE KGEKGQLGIM GYKGEKGEIG
SQGIKGMKGS DGLNGSTILF GQGLPRPYEG ENGDVYIDKN TGIMYKKING IWIPQVGLKG
SQGDQGYKGD QGSKGDKGQK GEFGSAGFKG DKGDMGQKGE TGAKGDKGDK GEGSKGSKGD
VGNKGDKGNK GDKGIKGDKG SEGIKGDNGI KGDNGTKGDN GTKGDNGTKG DKGDIGDNGI
KGDKGDIGDN GIKGDKGNKG DNGDKGNKGD KGDIGDKGMK GDKGDIGDKG DIGDKGMKGD
KGDIGDKGMK GDNSTKGDKG DNGTKGDKGD NGIKGDKGDN GTKGDNGDNG TKGDKGDNGI
KGDKGDKGTK GDKGDKGTKG DNGDKGTKGD NGIKGYKGDI GDKGIKGESG ANADKGDKGI
KGDKGDKGIK GDDGSKGDKG YNGEIGQKGD NGEKGDNGEK GDNGEKGDKG EKGDIGEKGD
NGEKGDIGEK GNKGSKGDKG EIGSSILFGQ GIPSPDLGND GDIYIDDNTG ILYKKLNGIW
VPQTDIKGEK GDKGESGQSA NKGDKGDKGN GGEIGNKGDK GSKGDIGDKG NKGDKGDGGI
KGNKGDKGSK GDKGSKGDKG DKGDEGIKGD KGNKGDKGDK GDIGSQGIKG ESGSAVFKGD
KGTKGDKGNK GDKGNKGDKG TNGDKGNKGD KGSKGDKGTK GDKGIKGDKG DKGSKGNKGS
KGDKGDKGDS GDKGDKGDKG SKGYKGDKGD KGSKGYKGDK GDKGIKGNTG SKGDKGSKGD
KGEKGSKGNK GEKGEKGFKG EKGSKGEKGS KGNKGDKGDK GFKGDNGIKG NIGVKGDKGD
SGIKGENGLK GDVGDKGIKG DKGNEGDKGD KGNKGEKGNR GDEGDKGIKG NKGDKGIKGS
EGDKGIKGES GSKGDKGEKG NKGYKGDKGD KGNLGIKGDK GDKGIKGVKG TKGDKGTKGV
KGTKGDKGDK GTNGDKGDKG IKGTNGDKGN KGLEGDKGNI GGKGDKGDKG DKGDKGDKGD
KGVNGDKGSK GDKGDQGTKG ETGLSIKGDK GDKGEFGLSI KGDKGVKGDQ GYKGDKGDKG
IKGDKGDKGI KGDQGIKGNK GDKGDKGNLG DKGDKGIKGD KGIKGDKGIK GDKGIKGDKG
IKGDKGDKGI KGDKGDKGDK DDKGNKGDKG DKGDKGIKGD KGDKGDKGDQ GDQGIKGESG
ASVFKGDKGD KGDKGDKGDK GDKGAKGDKG DKGDKGDQGI KGESGASVFK GDKGDTGSQG
DKGIKGESGV SLNYVMSYYN ATPGNFSSPV PVGASIAYVS TVGGGGGGSY FRRLNVGIVG
GGGGGGGGAL FRLPLSVMPG QSLSGVIGGP GLGAADSTTN ATKGGDTIIY YGQYTFIAGG
GNPGINSAAD TASFIKGGDG GTVTNPLLTT QPTPGTGSTS TGSAGGNGQM SFYCFSGAGG
GAGTSFTSSS GGNVGMFPGG NGVTTTYNNA GSGGGGASAF DKGGNGSIRF NPPSSGTKGS
GGGGSVQGGG GTIPNDGYPG GNGGPGFVSI DYYSS