Assigning Suffixes

How the IPD-IMGT/HLA Database assigns different suffixes to HLA alleles

Suffixes are applied to certain alleles to indicate their level of expression. There is no single definitive rule for assigning each of these suffixes; their use is guided by established principles based on expression studies and our current understanding of allele expression. As new data become available, suffix assignments may be revised. A list of alleles that have undergone suffix changes, with the reason for each change, is available here: https://hla.alleles.org/pages/genes/suffix_changes/.

Questionable Alleles

The `Q` suffix is used for alleles whose expression is considered questionable. Their sequence may or may not be expressed, and sufficient work has not yet been carried out to determine this. Questionable expression can be caused by a number of different types of mutation; a few examples are below:


  • Class I alleles which have a non-synonymous mutation at amino acid 101 or 164. This affects the disulphide bond and alters the conformation of the protein, which may affect expression.
  • In-frame indels, that is, indels which add or delete amino acids without causing a frameshift and premature stop. If the change is small, it may or may not affect expression.
  • Sequences containing a premature stop in the transmembrane domain. We are unsure whether the sequence retains enough of an anchor to attach to the cell membrane, and are therefore unsure whether it is expressed.
  • Sequences with non-synonymous mutations in the start codon, which may affect where the protein starts and whether it is long enough to anchor. If there is another methionine in exon 1, we will assign a `Q` suffix; if there is no other methionine in exon 1, we will assign an `N` suffix.

In this example, the next ATG is located in the leader sequence, so we assume this methionine will act as the start codon. As it is only 3 amino acids downstream of the canonical start, the allele may be expressed. Unless expression analysis is carried out to determine whether the allele is expressed, we will assign it a `Q` suffix.
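The start codon rule can be sketched in code. This is only an illustration under stated assumptions, not the database's actual procedure: the function name `suffix_for_start_codon_mutation` is hypothetical, the input is assumed to be the coding nucleotides of exon 1 beginning at the (mutated) canonical start codon, and "another methionine" is interpreted as an in-frame ATG.

```python
def suffix_for_start_codon_mutation(exon1_coding: str) -> str:
    """Suggest a default suffix for an allele whose canonical start codon is lost.

    exon1_coding: coding nucleotides of exon 1, starting at the position of the
    canonical start codon (assumed to be mutated). Hypothetical input format.
    Returns "Q" if another in-frame ATG exists in exon 1, otherwise "N".
    """
    seq = exon1_coding.upper()
    # Step through complete codons, skipping the first (mutated) codon.
    for i in range(3, len(seq) - 2, 3):
        if seq[i:i + 3] == "ATG":
            return "Q"  # an alternative methionine is available
    return "N"  # no alternative start in exon 1


# An alternative ATG three codons downstream gives a default of "Q".
print(suffix_for_start_codon_mutation("ACGGCCGTCATGGCGCCC"))
```

Either result is only a default; expression analysis supplied by a submitter would take precedence.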


A list of all currently named `Q` alleles is available here.


Splice site variants

Alleles with splice site mutations typically have a mutation on the intron side of an exon/intron boundary. This affects how the sequence is spliced, which may result in exonic sequence being removed or intronic sequence being added. As we are unsure of the effect of these mutations, we will typically assign a `Q` suffix to alleles with a mutation in the first two or last two bases of an intron, as this is considered to be the region which affects splicing in HLA alleles.
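The "first two or last two bases of an intron" check can be written as a small helper. This is a minimal sketch; the function name and the 1-based position convention are assumptions made for illustration.

```python
def is_splice_site_position(pos_in_intron: int, intron_length: int) -> bool:
    """Return True if a 1-based intron position is one of the first two or
    last two bases of the intron (the region treated as the splice site)."""
    if intron_length < 1 or not 1 <= pos_in_intron <= intron_length:
        raise ValueError("position must lie within the intron")
    in_first_two = pos_in_intron <= 2
    in_last_two = pos_in_intron > intron_length - 2
    return in_first_two or in_last_two


# For a 130 bp intron, positions 1, 2, 129 and 130 count as splice site bases.
print([p for p in range(1, 131) if is_splice_site_position(p, 130)])
```

A mutation at one of these positions would typically lead to a `Q` suffix unless expression data are available.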



As with other alleles, submitters to the database can carry out the necessary work to determine the translated sequence, or carry out expression analysis to determine whether the allele is expressed, which allows us to mark this up more accurately in the database.


HLA-DQB1 is a unique case amongst HLA genes. Exon 5 is variably expressed as the result of a mutation in the splice site preceding exon 5. We know that this mutation does not affect expression, and it is found in many HLA-DQB1 alleles. As a result, we do not assign a `Q` suffix to these alleles.

Null Alleles

The `N` suffix is used for alleles which have been shown, or are inferred, not to be expressed. This is based either on work carried out by the submitter showing that the allele is not expressed, or on the inferred protein structure, where the sequence is believed to result in a null allele. Null alleles can be caused by a number of different types of mutation; a few examples are below:


  • The most common null alleles are those with premature stop codons, caused either by a point mutation that introduces a premature stop or by an indel that causes a frameshift and premature stop. There is some nuance in assigning the `N` suffix. As a general rule, if a point mutation results in a premature stop within the Alpha one, two or three domains (Class I) or within the Alpha one, two or Beta one, two domains (Class II), we will infer that the protein sequence is null. This is because we infer that the protein does not have a 3' end to anchor into the cytoplasm, resulting in a null allele. If an indel that results in a frameshift begins within these domains, we will also assign an `N` suffix.
  • Alleles with mutations in the starting methionine. If there are no other methionines in the leader sequence, we infer that the protein cannot start correctly, resulting in a null allele.
  • Some alleles have mutations in the splice site (see Splice site variants). By default, these are given a `Q` suffix; however, if expression analysis is carried out and the allele is shown not to be expressed, it is then assigned an `N` suffix. We encourage expression analysis that provides the amino acid sequence for accurate mark-up, because splice mutations may cause exons to be incorrectly spliced in or out, leading to an altered or absent protein sequence and therefore a null allele.
    Although this results in a new protein sequence, the underlying cause is an intronic mutation, so we name the allele as a novel intronic variant. An example of this is A*01:01:01:02N.
  • Some sequences have mutations which cause a frameshift and additional sequence past the usual stop codon.
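The general rule for premature stops can be summarised as a lookup on the domain in which the stop falls. The sketch below is illustrative only: the domain labels, the `NULL_DOMAINS` table and the function name are hypothetical, and the transmembrane case reflects the `Q` guidance given under Questionable Alleles. Real assignments involve more nuance and are overridden by expression data.

```python
from typing import Optional

# Domains in which a premature stop leads to an inferred null allele.
# Labels are hypothetical shorthand for the domains named in the text.
NULL_DOMAINS = {
    "I": {"alpha1", "alpha2", "alpha3"},
    "II": {"alpha1", "alpha2", "beta1", "beta2"},
}


def suffix_for_premature_stop(hla_class: str, domain: str) -> Optional[str]:
    """Suggest a default suffix from the domain containing a premature stop.

    Returns "N" for the extracellular domains listed above, "Q" for the
    transmembrane domain, and None where the text gives no general rule.
    """
    if hla_class not in NULL_DOMAINS:
        raise ValueError("hla_class must be 'I' or 'II'")
    if domain in NULL_DOMAINS[hla_class]:
        return "N"  # inferred null: the anchoring region is lost
    if domain == "transmembrane":
        return "Q"  # unclear whether enough of an anchor remains
    return None  # not covered by the general rule


print(suffix_for_premature_stop("I", "alpha2"))          # N
print(suffix_for_premature_stop("II", "transmembrane"))  # Q
```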

Some null alleles appear to share an identical amino acid sequence up to the stop codon but have different assigned protein names. This is because the inferred sequence after the stop codon differs. An example is below, where the amino acid sequence up to the stop codon is identical but the sequence after it differs drastically, which is why the alleles have received different protein names.


A list of all currently named `N` alleles is available here.

Other Suffixes

These alleles have mutations which result in, for example, low expression or a secreted protein. In each case, work has been carried out to demonstrate the aberrant expression, which distinguishes the allele from a questionable, null or normally expressed allele.


The `L` suffix is used to indicate an allele which has been shown to have `Low` cell surface expression when compared to normal levels. The `S` suffix is used to denote an allele specifying a protein which is expressed as a soluble, `Secreted` molecule but is not present on the cell surface. The `C` suffix is assigned to alleles that produce proteins that are present in the `Cytoplasm` and not on the cell surface. The `A` suffix indicates `Aberrant` expression, where there is some doubt as to whether a protein is actually expressed.
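For readers handling allele names programmatically, the suffix letters can be read off the end of a name. The following is a rough sketch: the regular expression is a simplified assumption about allele name syntax (optional `HLA-` prefix, gene, `*`, up to four colon-separated numeric fields, optional suffix letter) rather than an official grammar, and the names `SUFFIX_MEANINGS` and `expression_suffix` are hypothetical.

```python
import re
from typing import Optional

# Meanings paraphrased from the descriptions on this page.
SUFFIX_MEANINGS = {
    "N": "Null: shown or inferred not to be expressed",
    "Q": "Questionable expression",
    "L": "Low cell surface expression",
    "S": "Secreted, soluble molecule not present on the cell surface",
    "C": "Cytoplasm: present in the cytoplasm, not on the cell surface",
    "A": "Aberrant expression",
}

# Simplified pattern for an allele name (an assumption, not an official grammar).
_ALLELE_PATTERN = re.compile(r"(?:HLA-)?[A-Z0-9]+\*\d+(?::\d+){0,3}([NLSCAQ])?")


def expression_suffix(allele_name: str) -> Optional[str]:
    """Return the expression suffix letter of an allele name, or None if absent."""
    match = _ALLELE_PATTERN.fullmatch(allele_name)
    if match is None:
        raise ValueError(f"unrecognised allele name: {allele_name!r}")
    return match.group(1)


suffix = expression_suffix("A*01:01:01:02N")
print(suffix, "->", SUFFIX_MEANINGS[suffix])
```

A name without a suffix letter, such as one for a normally expressed allele, returns `None`.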


As of December 2025 no alleles have been named with the `C` or `A` suffixes.

Last updated: 08-Jan-2026