COLL4_MIMIV
ID COLL4_MIMIV Reviewed; 817 AA.
AC Q5UPS7;
DT 13-SEP-2005, integrated into UniProtKB/Swiss-Prot.
DT 07-DEC-2004, sequence version 1.
DT 29-SEP-2021, entry version 62.
DE RecName: Full=Collagen-like protein 4;
GN OrderedLocusNames=MIMI_R240;
OS Acanthamoeba polyphaga mimivirus (APMV).
OC Viruses; Varidnaviria; Bamfordvirae; Nucleocytoviricota; Megaviricetes;
OC Imitervirales; Mimiviridae; Mimivirus.
OX NCBI_TaxID=212035;
OH NCBI_TaxID=5757; Acanthamoeba polyphaga (Amoeba).
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Rowbotham-Bradford;
RX PubMed=15486256; DOI=10.1126/science.1101485;
RA Raoult D., Audic S., Robert C., Abergel C., Renesto P., Ogata H.,
RA La Scola B., Susan M., Claverie J.-M.;
RT "The 1.2-megabase genome sequence of Mimivirus.";
RL Science 306:1344-1350(2004).
CC -!- FUNCTION: May participate in the formation of a layer of cross-linked
CC glycosylated fibrils at the viral surface, thus giving it a hairy
CC appearance. {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Virion.
CC -!- PTM: May be hydroxylated on lysine by the viral-encoded procollagen-
CC lysine,2-oxoglutarate 5-dioxygenase. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AY653733; AAV50513.1; -; Genomic_DNA.
DR RefSeq; YP_003986736.1; NC_014649.1.
DR PRIDE; Q5UPS7; -.
DR GeneID; 9924847; -.
DR KEGG; vg:9924847; -.
DR Proteomes; UP000001134; Genome.
DR InterPro; IPR008160; Collagen.
DR Pfam; PF01391; Collagen; 3.
PE 4: Predicted;
KW Collagen; Glycoprotein; Hydroxylation; Reference proteome; Repeat; Virion.
FT CHAIN 1..817
FT /note="Collagen-like protein 4"
FT /id="PRO_0000059419"
FT DOMAIN 83..142
FT /note="Collagen-like 1"
FT DOMAIN 145..264
FT /note="Collagen-like 2"
FT DOMAIN 268..327
FT /note="Collagen-like 3"
FT DOMAIN 352..411
FT /note="Collagen-like 4"
FT DOMAIN 430..489
FT /note="Collagen-like 5"
FT DOMAIN 512..570
FT /note="Collagen-like 6"
FT REGION 87..107
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 120..458
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 479..543
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 757..804
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 290..342
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 379..393
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 427..457
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 518..543
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 106
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 121
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 183
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 345
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 360
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 483
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 709
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 712
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 715
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 772
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
SQ SEQUENCE 817 AA; 79512 MW; 116D3B2C35711E04 CRC64;
MANNVNLLYG LPQRVITLPA NLIQSKSKIF IGNGDPNCII GNNCDMYIDK NSNRIFYKCG
CNWLLTGTIM GDNGFIGLRG SKGDTGNKGE IGDNGENGDI GQIGDNGSIG SKGIKGMNGF
NGSKGIKGDK GTDGIKGDKG QDNFGSKGQK GETGSKGDDG IKGITGSKGF KGDPGTKGEN
GINGTKGLKG SQGDLGTKGD DGIKGIIGSK GIKGDPGNKG EDGIKGTNGL KGSKGETGSK
GDDGTKGITG LKGTKGNSGS KGDDGDKGIQ GLKGEFGTKG NVGDKGDTGI NGEKGSDGDK
GNKGLDGIKG DLGDDGIKGD KGIKGLKGDT GNSDKGDKGS KGDSNFSKGG IGDKGSKGDN
GSKGESGDKG IFGLKGSKGD IGDKGEKGDL GDTGLKGSKG LKGSKGDKGL VNVKGENGFV
GDLGSKGSKG DKGESGDKGD IGIKGDKGAK GVTGDKGDKG TKGFIGNVGF KGDTGDKGII
GDNGSKGIKG SSNNKGDKGD KGNTGDKGIT NTKGIKGDKG IKGSKGDLGS VGEKGEKGTK
GDIGTKGETG LKGIIGDKGE LGSKGIKGLS ESFIESFYQE IPGTFVSTVP DGAVFGYLSA
AGGGGGGGGI ELSSLAGGGG GGSGCFYLLP LTVYPGSQFT GTIGQGGSGS TISAVLATKG
SDTVINYGTL TFIAHGGFPG SSTLELGGNG DTVTYPLPVT PAPGGTGGNM TNGSNGSTSI
FMFSGAGGGA SGLNSGFNGG NVGPYVGGTS SPQIAGGGGG ASAFGNGGRG GNTTQAATKG
EYGSGGGGGS EFSPSGSTNG GDGGDGFVRI DYFMVPR
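
The CARBOHYD features above carry the evidence code ECO:0000255, meaning they are predicted from sequence rather than observed. A minimal sketch of how such a prediction can be reproduced, assuming the usual N-X-[S/T] sequon rule (X is any residue except proline): the names `PREFIX`, `SEQUON` and `find_sequons` are hypothetical, and this is not UniProt's actual annotation pipeline. Only the first 130 residues of the SQ block are included to keep the example short.

```python
import re

# First 130 residues of the SQ block above, copied with its 10-residue spacing.
PREFIX = (
    "MANNVNLLYG LPQRVITLPA NLIQSKSKIF IGNGDPNCII GNNCDMYIDK NSNRIFYKCG "
    "CNWLLTGTIM GDNGFIGLRG SKGDTGNKGE IGDNGENGDI GQIGDNGSIG SKGIKGMNGF "
    "NGSKGIKGDK"
).replace(" ", "")

# Assumed sequon rule: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps the match one residue wide so overlapping motifs are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq):
    """Return 1-based positions of asparagines that start an N-X-[S/T] motif."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


print(find_sequons(PREFIX))
```

On this prefix the scan reports positions 106 and 121, which match the first two CARBOHYD features of the entry. A sequon match only marks a candidate site; it does not show that the asparagine is actually glycosylated.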