This page was produced as an assignment for Genetics 677, an undergraduate course at UW-Madison.

Scroll down to learn more about protein sequence, properties, isoforms, post-translational modifications, and trypsin cleavage sites.

Protein Sequence


The 915 amino acid sequence shown above represents the predominant variant, called isoform a, of the protein encoded by the human GRM7 gene (Entrez Protein, 2009). This protein is a precursor to a Group III metabotropic glutamate receptor involved in glutamate signaling in the central nervous system (Entrez Protein, 2009).

Protein accession number: NP_000835.1

GI number: 4504147
UniProtKB/Swiss-Prot accession number: Q14831

Protein Properties

Molecular weight: 102,251 Da (PhosphoSitePlus)
Basal Isoelectric point: 8.2 (PhosphoSitePlus)
Subcellular localization: Multipass membrane protein (Sprenger, 2008; Fink, et al., 2006)
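
As a rough illustration of where a figure like the molecular weight above comes from, the sketch below sums standard average residue masses and adds one water for the free termini. The function name `molecular_weight` and the toy inputs are assumptions made for this example; this is not the method used by PhosphoSitePlus, and the result for the full GRM7 sequence may differ slightly from the value reported there depending on the mass table and on whether the precursor or the mature protein is used.

```python
# Average residue masses in daltons (amino acid minus one water).
# These are standard textbook values, not taken from the original page.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def molecular_weight(sequence):
    """Return the average molecular weight (Da) of an unmodified peptide."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"Unrecognized residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


if __name__ == "__main__":
    # Toy dipeptide (Gly-Ala), used only to show the calculation.
    print(round(molecular_weight("GA"), 2))
```

Passing the full 915-residue isoform a sequence (NP_000835.1) to `molecular_weight` would give an estimate that can be compared with the PhosphoSitePlus value.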

Protein Isoforms

To determine the isoforms of the GRM7 protein, GRM7 was searched on the Entrez Protein and UniProt databases. Entrez Protein returned two isoforms: isoform a and isoform b. Isoform a, which was identified as the predominant isoform, is shown above (Entrez Protein, 2009). Isoform b, the only alternative identified by Entrez Protein, differs from isoform a in its 3' coding region, where it contains an alternate segment that causes a frameshift in the C-terminus of the GRM7 protein (Entrez Protein, 2009). UniProt returned these two isoforms, which it termed variants 1 and 2, as well as three additional isoforms, which it termed variants 3, 4, and 5 (The UniProt Consortium, 2008). Like variant 2 (isoform b), variants 3, 4, and 5 differ from variant 1 (isoform a) only in their C-termini (see Table 1 and Figure 1) (The UniProt Consortium, 2008).

Table 1. Isoforms of the GRM7 protein found using UniProt. Five isoforms of the GRM7 protein were found using the UniProt database. All isoforms match isoform 1, referred to above as isoform a, over their first 899 amino acids. The sequences of the five isoforms after amino acid position 899 are given in the table below (The UniProt Consortium, 2008).




Amino acid sequence at positions 900+ of the protein












Figure 1. ClustalW alignment of the C-terminal ends of the five isoforms of the GRM7 protein. The five isoforms of the GRM7 protein, which were found using UniProt, were aligned on the UniProt website using ClustalW. Q14831, Q14831-2, Q14831-3, Q14831-4, and Q14831-5 represent isoforms 1 (also referred to as isoform a), 2 (also referred to as isoform b), 3, 4, and 5, respectively. As shown in the figure, the isoforms differ from amino acid position 900 onward (The UniProt Consortium, 2008).
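
The point at which isoforms begin to differ can also be checked without a full alignment when, as here, the sequences share an identical N-terminal region. The sketch below is a minimal illustration of that idea; `first_divergence` and the toy sequences are hypothetical names and data invented for this example, not GRM7 sequences and not part of the ClustalW analysis.

```python
def first_divergence(reference, other):
    """Return the 1-based position of the first difference, or None if identical.

    If one sequence is a strict prefix of the other, the position just past
    the shorter sequence is returned.
    """
    for position, (ref_aa, other_aa) in enumerate(zip(reference, other), start=1):
        if ref_aa != other_aa:
            return position
    if len(reference) != len(other):
        return min(len(reference), len(other)) + 1
    return None


if __name__ == "__main__":
    # Toy sequences for illustration only (NOT the real GRM7 isoforms).
    toy_isoforms = {
        "isoform_1": "MKTLLVAGQRSTPW",
        "isoform_2": "MKTLLVAGNNC",
        "isoform_3": "MKTLLVAGQR",
    }
    reference = toy_isoforms["isoform_1"]
    for name, sequence in toy_isoforms.items():
        print(name, first_divergence(reference, sequence))
```

If the five UniProt isoform sequences were supplied in place of the toy data, each alternative isoform would be expected to diverge from isoform 1 at position 900, in line with Table 1 and Figure 1.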

Post-translational Modifications

To determine post-translational modifications of the GRM7 protein, GRM7 was searched on dbPTM as well as on UniProt. While no experimentally validated post-translational modifications were found in either database, several predicted sites of post-translational modification were returned. Specifically, dbPTM showed 13 predicted phosphorylation sites, three predicted S-palmitoyl cysteine sites, two predicted N-linked glycosylation sites, and one predicted O-linked N-acetylglucosamine site (Table 2) (Lee, et al., 2006). Interestingly, none of the post-translational modifications identified by dbPTM overlapped with those identified by UniProt, which showed an additional phosphorylation site as well as four N-linked glycosylation sites (Table 3) (The UniProt Consortium, 2008; Lee, et al., 2006).
From these predicted post-translational modifications, two primary inferences can be made about how GRM7 functions in the cell. First, based on the predicted presence of several phosphorylation sites, which are most often used by the cell to turn proteins "on" and "off", GRM7 protein activity is likely to be highly regulated at the post-translational level (The UniProt Consortium, 2008; Lee, et al., 2006). Second, the additional presence of glycosylation sites suggests that at least part of GRM7 extends extracellularly (The UniProt Consortium, 2008; Lee, et al., 2006). Knowing that GRM7 is a membrane-bound glutamate receptor, it makes sense that it would be post-translationally regulated as well as partly extracellular; however, experimental evidence will be needed to validate these predicted sites of post-translational modification (Entrez Protein, 2009).
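
To give a sense of how candidate glycosylation sites can be located from sequence alone, the sketch below scans for the widely used N-linked glycosylation consensus motif (Asn, then any residue except Pro, then Ser or Thr). This is a simple motif search swapped in for illustration; it is not the hidden Markov model used by dbPTM or the annotation procedure used by UniProt, and a motif match is only a candidate site. The function name and the toy sequence are assumptions made for this example.

```python
import re

# N-linked glycosylation consensus sequon: N, any residue except P, then S or T.
# The lookahead allows overlapping sequons (e.g. "NNTS") to be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sequons(sequence):
    """Return 1-based positions of Asn residues that start a consensus sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequence for illustration only (NOT GRM7).
    toy_sequence = "ANGTNPSANVS"
    print(find_n_glycosylation_sequons(toy_sequence))  # prints [2, 9]
```

In the toy sequence, the Asn at position 5 is skipped because it is followed by Pro. Running the same function on the isoform a sequence would give a list of candidate positions that could be compared with the predicted sites in Tables 2 and 3.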

Table 2. Post-translational modifications of GRM7 as predicted by the dbPTM database. The first column of the table indicates the position of the modified amino acid, while the second column gives the modification that occurs at that site. The third column indicates the amino acid sequence surrounding the modification site, with the modified amino acid shown in blue. Likewise, the fourth column shows the secondary structure surrounding the modified amino acid, with the modified part of the secondary structure in bold. Finally, the fifth column gives the source from which the prediction was taken. In this case, all site predictions were based on a hidden Markov model (HMM) used by the dbPTM database to predict sites of post-translational modification (Lee, et al., 2006).

Table 3. Post-translational modifications of GRM7 as predicted by the UniProt database. In the table below, the first column indicates broadly what type of post-translational modification occurs at the site listed in the second column, while the third column gives a more specific description of the modification. The orange rectangles in the third column indicate the type of evidence each post-translational modification is based on. In this particular case, all the post-translational modifications listed in the table are based on prediction, rather than on experimental evidence. Finally, the fourth column shows the position of the modification on the GRM7 protein: the left of the black line corresponds to amino acid position 0, while the right of the black line corresponds to amino acid position 915 (The UniProt Consortium, 2008).

Trypsin Cleavage Sites

To determine where the enzyme trypsin would cut GRM7, the protein sequence was input into ExPASy PeptideCutter. In proteomics studies seeking simply to identify all the proteins in a given sample, a trypsin digestion is often one of the first steps taken. A later mass spectrometry step is then used to identify proteins from the peptide fragments produced during the trypsin digestion. Thus, understanding how proteins are cleaved by trypsin is integral to identifying where and when the proteins are expressed. ExPASy PeptideCutter showed that GRM7 contains 91 trypsin cleavage sites. The sites identified by ExPASy PeptideCutter are listed below. Position 1 refers to the N-terminus of GRM7, while position 915 refers to the C-terminus.

Trypsin cleavage sites, as identified by ExPASy PeptideCutter:
5 6 9 15 33 44 60 71 72 78 104 111 129 135 149 170 190 191 197 214 233 243 255 260 261 263 269 272 281 293 300 301 319 339 340 353 359 360 375 382 383 388 389 395 398 407 427 434 446 447 450 453 469 476 493 513 516 532 533 534 537 573 583 614 622 626 658 659 676 679 682 688 689 695 736 744 748 776 778 817 844 858 859 860 861 864 875 889 904 905 906
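
The kind of prediction summarized above can be approximated with the commonly cited simple trypsin rule: cleave on the C-terminal side of lysine (K) or arginine (R), except when the next residue is proline (P). The sketch below applies only that simplified rule to a toy sequence. The function names and the toy sequence are assumptions for illustration, and ExPASy PeptideCutter uses a more detailed trypsin model with additional exceptions, so the positions it reports for GRM7 may not match this rule exactly.

```python
def trypsin_cleavage_sites(sequence):
    """Return 1-based positions of K/R residues after which trypsin cleaves.

    Simplified rule: cleave after K or R unless the next residue is P.
    The C-terminal residue is never reported, since nothing follows it.
    """
    sequence = sequence.upper()
    sites = []
    for index, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[index + 1] != "P":
            sites.append(index + 1)
    return sites


def digest(sequence):
    """Split a sequence into the peptides produced by a complete digestion."""
    sequence = sequence.upper()
    peptides = []
    start = 0
    for site in trypsin_cleavage_sites(sequence):
        peptides.append(sequence[start:site])
        start = site
    peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    # Toy sequence for illustration only (NOT GRM7).
    toy_sequence = "MAKRPLGRSTK"
    print(trypsin_cleavage_sites(toy_sequence))  # prints [3, 8]
    print(digest(toy_sequence))  # prints ['MAK', 'RPLGR', 'STK']
```

Here the arginine at position 4 is not cleaved because it is followed by proline, and the lysine at position 11 is the C-terminal residue. The peptides returned by `digest` are the fragments whose masses would be matched in the mass spectrometry step described above.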




dbPTM. (2007). Metabotropic glutamate receptor 7 precursor (mGluR7). Retrieved April 25, 2009, from

Entrez Protein. (2009). glutamate receptor, metabotropic 7 isoform a precursor [Homo sapiens]. Retrieved February 27, 2009, from

Entrez Protein. (2009). glutamate receptor, metabotropic 7 isoform b precursor [Homo sapiens]. Retrieved April 25, 2009, from

Fink, J.L., Aturaliya, R.N., Davis, M.J., Zhang, F., Hanson, K., Teasdale, M.S., Kai, C., Kawai, J., Carninci, P., Hayashizaki, Y., Teasdale, R.D. (2006). LOCATE: a mouse protein subcellular localization database. Nucleic Acids Research, 34(Database issue):D213. doi:10.1093/nar/gkj069

Lee, T.Y., Huang, H.D., Hung, J.H., Huang, H.Y., Yang, Y.S., and Wang, T.H. (2006). dbPTM: An information repository of protein post-translational modification. Nucleic Acids Research, 34(Database issue):D622. doi:10.1093/nar/gkj083

PhosphoSitePlus. (2007). mGluR7 (human). Retrieved April 25, 2009, from

Sprenger, J., Fink, J.L., Karunaratne, S., Hanson, K., Hamilton, N.A., Teasdale, R.D. (2008). LOCATE: a mammalian protein subcellular localization database. Nucleic Acids Research, 36(Database issue):D230. doi:10.1093/nar/gkm950

The UniProt Consortium. (2008). The Universal Protein Resource (UniProt). Nucleic Acids Research, 36(Database issue):D190. doi:10.1093/nar/gkm895

Jennifer Wagner
Updated May 14, 2009