bioworld

Sunday, February 24, 2008

GENERATION OF ANTIBODY DIVERSITY

The immune system has the capacity to recognize and respond to about 10⁷ different antigens. This extreme diversity can be generated in at least three possible ways:

Multiple genes in the germ line DNA.
Variable recombination during the differentiation of germ line cells into B-cells.
Mutation during the differentiation of germ line cells into B-cells.

It is known that all three of these possibilities take place to produce antibody diversity. The following figures illustrate these possibilities:

Antibody Diversity: Multiple Genes

The figure shows the genetic makeup of a germ line cell and a mature B-cell at the loci controlling heavy chain production. Germ line DNA has many (up to 200) different variable (V) region genes, in addition to 12 diversity (D) region genes and four joining (J) region genes. During differentiation of this cell into the B-cell, rearrangement of the DNA occurs. This rearrangement aligns one of the many V genes with one of the D genes and one of the J genes, producing a functional VDJ recombinant gene. Since any of the genes may recombine with any others, this rearrangement has the potential to generate 200 x 12 x 4 = 9600 different possible combinations. The same type of event occurs in the genes encoding the immunoglobulin light chains, where about 200 different V regions may recombine with about 5 different J regions, giving rise to 200 x 5 = 1000 possible light chains. Since in any particular B-cell any light chain combination can occur along with any heavy chain combination, the total number of possible immunoglobulin combinations approaches 10⁷ (9600 x 1000).
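The arithmetic above can be checked with a few lines of Python. This is only a sketch using the approximate segment counts quoted in the text; the function and variable names are made up for illustration.

```python
# Segment counts quoted in the text above (approximate germ line numbers).
HEAVY_V, HEAVY_D, HEAVY_J = 200, 12, 4
LIGHT_V, LIGHT_J = 200, 5


def combinations(*segment_counts):
    """Multiply the number of choices available at each gene segment."""
    total = 1
    for count in segment_counts:
        total *= count
    return total


heavy_chains = combinations(HEAVY_V, HEAVY_D, HEAVY_J)  # V x D x J
light_chains = combinations(LIGHT_V, LIGHT_J)           # V x J
antibodies = heavy_chains * light_chains                # any heavy with any light

print("heavy chains:", heavy_chains)
print("light chains:", light_chains)
print("heavy/light pairings:", antibodies)
```

The product, 9.6 million pairings, is the figure that the text rounds up to about 10⁷.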
A second way that diversity can result is through a process of variable or "inaccurate" recombination. The figure illustrates three possible recombination events between the variable (V) and joining (J) regions of an immunoglobulin light chain. In the first event, a proline-tryptophan dipeptide sequence is produced in the resulting protein. However, in the second and third events, differential recombination places proline-arginine or proline-proline sequences into the resulting immunoglobulin. These types of events may also occur between the V and D regions and the D and J regions of the heavy chain DNA sequence.
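A small sketch can show how shifting the joining point changes the encoded dipeptide. The figure itself is not reproduced here, so the two sequences below are hypothetical: they were chosen only because they give the three dipeptides named above. The codon assignments are from the standard genetic code.

```python
# Only the codons needed for this illustration (standard genetic code).
CODON_TABLE = {"CCT": "Pro", "CCC": "Pro", "CCG": "Pro", "TGG": "Trp", "CGG": "Arg"}

# Hypothetical sequences: the last two codons of a V segment and the
# first codon of a J segment, picked to reproduce the dipeptides in the text.
V_END = "CCTCCC"
J_START = "TGG"


def join_vj(v_end, j_start, crossover):
    """Join V to J, taking `crossover` bases of the junction codon from V
    and the remaining bases of that codon from J."""
    fixed_codon = v_end[:3]
    junction_codon = v_end[3:3 + crossover] + j_start[crossover:]
    return fixed_codon + junction_codon


def translate(dna):
    """Translate a DNA string codon by codon into three-letter residues."""
    return "-".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


for crossover in range(3):
    joined = join_vj(V_END, J_START, crossover)
    print(crossover, joined, translate(joined))
```

With these assumed sequences, moving the joining point by a single base at a time gives Pro-Trp, then Pro-Arg, then Pro-Pro.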
A third way that diversity can result is through a process of mutation. This process simply involves changes in DNA sequence that occur during differentiation of the B-cell. The figure illustrates how an A:T to G:C transition mutation could change a serine residue into a glycine residue in the resulting immunoglobulin. This process may, in part, explain the diversity observed in hypervariable (CDR) regions.
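The serine-to-glycine change can be illustrated with a single codon. The codon used in the figure is not given in the text, so AGC is an assumption; it is one serine codon in the standard genetic code that becomes a glycine codon (GGC) after an A to G transition.

```python
# Two entries from the standard genetic code, enough for this example.
CODON_TABLE = {"AGC": "Ser", "GGC": "Gly"}


def point_mutate(codon, position, new_base):
    """Return the codon with the base at `position` replaced by `new_base`."""
    return codon[:position] + new_base + codon[position + 1:]


original = "AGC"                        # assumed serine codon
mutant = point_mutate(original, 0, "G")  # A -> G transition at the first base

print(original, CODON_TABLE[original])
print(mutant, CODON_TABLE[mutant])
```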

IMMUNOGLOBULIN PRODUCTION

The production of immunoglobulins by B-cells or plasma cells occurs in different stages. During differentiation of the B-cells from precursor stem cells, rearrangement, recombination and mutation of the immunoglobulin V, D, and J regions occur to produce functional VJ (light chain) and VDJ (heavy chain) genes. At this point, the antigen specificity of the mature B-cell has been determined. Each cell can make only one heavy chain and one light chain, although the isotype of the heavy chain may change. Initially, a mature B-cell will produce primarily IgD (and some membrane IgM) that will migrate to the cell surface to act as the antigen receptor. Upon stimulation by antigen, the B-cell will differentiate into a plasma cell expressing large amounts of secreted IgM. Some cells will undergo a "class switch", during which a rearrangement of the DNA places the VDJ gene next to the genes encoding the IgG, IgE or IgA constant regions. Upon secondary induction (i.e. the secondary response), these B-cells will differentiate into plasma cells expressing the new isotype. Most commonly, this results in a switch from IgM (primary response) to IgG (secondary response). The factors that lead to production of IgE or IgA instead of IgG are not well understood.

Immunoglobulins

Immunoglobulins generally assume one of two roles: immunoglobulins may act as i) plasma membrane bound antigen receptors on the surface of a B-cell or ii) as antibodies free in cellular fluids functioning to intercept and eliminate antigenic determinants. In either role, antibody function is intimately related to its structure and this page will introduce immunoglobulins (antibodies) and relate their structure to their function in host defense.

BASIC IMMUNOGLOBULIN STRUCTURE

Immunoglobulins are composed of four polypeptide chains: two "light" chains (lambda or kappa), and two "heavy" chains (alpha, delta, gamma, epsilon or mu). The type of heavy chain determines the immunoglobulin isotype (IgA, IgD, IgG, IgE, IgM, respectively). Light chains are composed of about 220 amino acid residues, while heavy chains are composed of 440-550 amino acids. Each chain has "constant" and "variable" regions as shown in the figure. Variable regions are contained within the amino (NH₂) terminal end of the polypeptide chain (amino acids 1-110). When comparing one antibody to another, these amino acid sequences are quite distinct. Constant regions, comprising amino acids 111-220 in light chains (or 111 to 440-550 in heavy chains), are comparatively uniform from one antibody to another within the same isotype. "Hypervariable" regions, or "Complementarity Determining Regions" (CDRs), are found within the variable regions of both the heavy and light chains. These regions serve to recognize and bind specifically to antigen. The four polypeptide chains are held together by covalent disulfide (-S-S-) bonds.

IgG2 3-Dimensional Structure, Ag binding


Structural differences between immunoglobulins are used for their classification. As stated above, the type of heavy chain an immunoglobulin possesses determines the immunoglobulin "isotype". More specifically, an isotype is determined by the primary sequence of amino acids in the constant region of the heavy chain, which in turn determines the three-dimensional structure of the molecule. Since immunoglobulins are proteins, they can act as antigens, eliciting an immune response that generates anti-immunoglobulin antibodies. However, the structural (three-dimensional) features that define isotypes are not immunogenic in an animal of the same species, since they are not seen as "foreign". For example, the five human isotypes, IgA, IgD, IgG, IgE and IgM, are found in all humans and, as a result, injection of human IgG into another human would not generate antibodies directed against the structural features (determinants) that define the IgG isotype. However, injection of human IgG into a rabbit would generate antibodies directed against those same structural features.

Another means of classifying immunoglobulins is defined by the term "allotype". Like isotypes, allotypes are determined by the amino acid sequence and corresponding three-dimensional structure of the constant region of the immunoglobulin molecule. Unlike isotypes, allotypes reflect genetic differences between members of the same species. This means that not all members of the species will possess any particular allotype. Therefore, injection of any specific human allotype into another human could possibly generate antibodies directed against the structural features that define that particular allotypic variation.

A third means of classifying immunoglobulins is defined by the term "idiotype". Unlike isotypes and allotypes, idiotypes are determined by the amino acid sequence and corresponding three-dimensional structure of the variable region of the immunoglobulin molecule. In this regard, idiotypes reflect the antigen binding specificity of any particular antibody molecule. Idiotypes are so unique that an individual person is probably capable of generating antibodies directed against their own idiotypic determinants. This probability forms the basis of the Idiotypic Network Hypothesis to be described later.

BASIC IMMUNOGLOBULIN FUNCTION

Antibodies function in a variety of ways designed to eliminate the antigen that elicited their production. Some of these functions are independent of the particular class (isotype) of immunoglobulin. These functions reflect the antigen binding capacity of the molecule as defined by the variable and hypervariable (idiotypic) regions. For example, an antibody might bind to a toxin and prevent that toxin from entering host cells where its biological effects would be activated. Similarly, a different antibody might bind to the surface of a virus and prevent that virus from entering its host cell. In contrast, other antibody functions are dependent upon the immunoglobulin class (isotype). These functions are contained within the constant regions of the molecule. For example, only IgG and IgM antibodies have the ability to interact with and initiate the complement cascade. Likewise, only IgG molecules can bind to the surface of macrophages via Fc receptors to promote and enhance phagocytosis. The following table summarizes some immunoglobulin properties.

Isotype	Placental transfer	Binds mast cell surfaces	Binds phagocytic cell surfaces	Activates complement	Additional features
IgM	-	-	-	+	First Ab in development and response.
IgD	-	-	-	-	B-cell receptor.
IgG	+	-	+	+	Involved in opsonization and ADCC. Four subclasses; IgG1, IgG2, IgG3, IgG4.
IgE	-	+	-	-	Involved in allergic responses.
IgA	-	-	-	-	Two subclasses; IgA1, IgA2. Also found as dimer (sIgA) in secretions.

Friday, July 27, 2007

Transposon

Transposons are sequences of DNA that can move around to different positions within the genome of a single cell, a process called transposition. In the process, they can cause mutations and change the amount of DNA in the genome. Transposons are also called "jumping genes", and are examples of mobile genetic elements. They were discovered by Barbara McClintock early in her career [1], work for which she received a Nobel Prize in 1983. There are a variety of mobile genetic elements, and they can be grouped based on their mechanism of transposition. Class I mobile genetic elements, or retrotransposons, move in the genome by being transcribed to RNA and then back to DNA by reverse transcriptase, while class II mobile genetic elements move directly from one position to another within the genome using a transposase to "cut and paste" them within the genome. Transposons are very useful to researchers as a means to alter DNA inside of a living organism. Transposons make up a large fraction of genome size, which is evident in the C-values of eukaryotic species. As an example, about 48% of the human genome is composed of transposons and their defunct remnants.

Types of transposons

Transposons are classified into two classes based on their mechanism of transposition.

Class I: Retrotransposons

Retrotransposons work by copying themselves and pasting the copies back into the genome in multiple places. A retrotransposon is first transcribed into RNA; the RNA is then copied back into DNA by a reverse transcriptase (often encoded by the transposon itself), and this DNA copy is inserted back into the genome.

Retrotransposons behave very similarly to retroviruses, such as HIV, giving a clue to the evolutionary origins of such viruses.

There are three main classes of Retrotransposons:

Viral: encode reverse transcriptase (to reverse transcribe RNA into DNA), have long terminal repeats (LTRs), similar to retroviruses
LINEs: encode reverse transcriptase, lack LTRs, transcribed by RNA polymerase II
Nonviral superfamily: do not code for reverse transcriptase, transcribed by RNA polymerase III

Class II: DNA transposons

The major difference of Class II transposons from retrotransposons is that their transposition mechanism does not involve an RNA intermediate. Class II transposons usually move by cut and paste, rather than copy and paste, using the transposase enzyme. Different types of transposase work in different ways. Some can bind to any part of the DNA molecule, and the target site can therefore be anywhere, while others bind to specific sequences. Transposase makes a staggered cut at the target site producing sticky ends, cuts out the transposon and ligates it into the target site. A DNA polymerase fills in the resulting gaps from the sticky ends and DNA ligase closes the sugar-phosphate backbone. This results in target site duplication and the insertion sites of DNA transposons may be identified by short direct repeats (a staggered cut in the target DNA filled by DNA polymerase) followed by inverted repeats (which are important for the transposon excision by transposase).
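The cut-and-paste step and the resulting target site duplication can be sketched on plain strings. This is a deliberately simplified, single-strand model and not a description of any particular transposase: the sequences, the five-base stagger and the function name are all assumptions made for illustration.

```python
# Hypothetical transposon: inverted repeats (CAGG ... CCTG) around a payload.
TRANSPOSON = "CAGG" + "ATATAT" + "CCTG"


def cut_and_paste(donor, start, end, target, site, stagger):
    """Excise donor[start:end] and insert it into `target`.

    The staggered cut spans target[site:site + stagger]. After gap filling,
    that stretch is present on both sides of the insert (the direct repeats).
    """
    element = donor[start:end]
    donor_after = donor[:start] + donor[end:]          # element leaves the donor
    duplicated = target[site:site + stagger]           # becomes the direct repeat
    target_after = target[:site + stagger] + element + target[site:]
    return donor_after, target_after, duplicated


donor = "TTTT" + TRANSPOSON + "TTTT"
target = "AAACGTACTTT"

donor_after, target_after, repeat = cut_and_paste(
    donor, 4, 4 + len(TRANSPOSON), target, site=3, stagger=5
)
print("donor after excision:", donor_after)
print("target after insertion:", target_after)
print("direct repeat flanking the insert:", repeat)
```

In the output, the five bases covered by the staggered cut appear immediately before and immediately after the inserted element, which is the signature used to recognise insertion sites.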

Not all DNA transposons transpose through a cut-and-paste mechanism. In some cases replicative transposition is observed, in which the transposon replicates itself to a new target site.

Both classes of transposon may lose their ability to synthesise reverse transcriptase or transposase through mutation, yet continue to jump through the genome because other transposons are still producing the necessary enzyme.

Examples

The first transposons were discovered in maize (Zea mays, corn) by Barbara McClintock in 1948, for which she was awarded a Nobel Prize in 1983. She noticed insertions, deletions, and translocations caused by these transposons. These changes in the genome could, for example, lead to a change in the color of corn kernels. About 50% of the total genome of maize consists of transposons. The Ac/Ds system McClintock described consists of class II transposons.
One family of transposons in the fruit fly Drosophila melanogaster is called P elements. They seem to have first appeared in the species only in the middle of the twentieth century. Within 50 years, they spread through every population of the species. Artificial P elements can be used to insert genes into Drosophila by injecting the embryo. For the use of P elements as a genetic tool see: "transposons as a genetic tool".
Transposons in bacteria usually carry an additional gene for a function other than transposition, often antibiotic resistance. In bacteria, transposons can jump from chromosomal DNA to plasmid DNA and back, allowing for the transfer and permanent addition of genes such as those encoding antibiotic resistance (multi-antibiotic resistant bacterial strains can be generated in this way). Bacterial transposons of this type belong to the Tn family. When the transposable elements lack additional genes, they are known as insertion sequences.
The most common form of transposon in humans is the Alu sequence. The Alu sequence is approximately 300 bases long and can be found between 300,000 and a million times in the human genome.
Mu phage transposition is the best known example of replicative transposition. Its transposition mechanism is somewhat similar to homologous recombination.
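The Alu figures quoted in the list above can be turned into a rough share of the genome. The element length and copy numbers come from the text; the genome size of about 3.1 billion base pairs is an outside assumption, so the percentages are only order-of-magnitude estimates.

```python
ALU_LENGTH_BP = 300                          # from the text
COPIES_LOW, COPIES_HIGH = 300_000, 1_000_000  # from the text
HUMAN_GENOME_BP = 3_100_000_000              # assumption: approximate haploid genome size

low_bp = ALU_LENGTH_BP * COPIES_LOW
high_bp = ALU_LENGTH_BP * COPIES_HIGH
low_fraction = low_bp / HUMAN_GENOME_BP
high_fraction = high_bp / HUMAN_GENOME_BP

print(f"Alu DNA: {low_bp:,} to {high_bp:,} bp")
print(f"Share of genome: {low_fraction:.1%} to {high_fraction:.1%}")
```

Under that assumption, Alu sequences alone would account for somewhere between roughly 3% and 10% of the genome.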

Transposons causing diseases

Transposons are mutagens. They can damage the genome of their host cell in different ways:

A transposon or retrotransposon that inserts itself into a functional gene will most likely disable that gene.
After a transposon leaves a gene, the resulting gap will probably not be repaired correctly.
Multiple copies of the same sequence, such as Alu sequences, can hinder precise chromosomal pairing during mitosis and meiosis, resulting in unequal crossovers, one of the main reasons for chromosome duplication.

Diseases that are often caused by transposons include hemophilia A and B, severe combined immunodeficiency, porphyria, predisposition to cancer, and Duchenne muscular dystrophy.

Additionally, many transposons contain promoters which drive transcription of their own transposase. These promoters can cause aberrant expression of linked genes, causing disease or mutant phenotypes.

Evolution of transposons

The evolution of transposons and their effect on genome evolution is currently a dynamic field of study.

Transposons are found in all major branches of life. They may or may not have originated in the last universal common ancestor, or arisen independently multiple times, or perhaps arisen once and then spread to other kingdoms by horizontal gene transfer. While transposons may confer some benefits on their hosts, they are generally considered to be selfish DNA parasites that live within the genome of cellular organisms. In this way, they are similar to viruses. Viruses and transposons also share features in their genome structure and biochemical abilities, leading to speculation that they share a common ancestor.

Since excessive transposon activity can destroy a genome, many organisms seem to have developed mechanisms to reduce transposition to a manageable level. Bacteria may undergo high rates of gene deletion as part of a mechanism to remove transposons and viruses from their genomes while eukaryotic organisms may have developed the RNA interference (RNAi) mechanism as a way of reducing transposon activity. In the nematode Caenorhabditis elegans, some genes required for RNAi also reduce transposon activity.

Transposons may have been co-opted by the vertebrate immune system as a means of producing antibody diversity. The V(D)J recombination system operates by a mechanism similar to that of transposons.

Evidence exists that transposable elements may act as mutators in bacteria.

Applications

Transposons were first discovered in a plant, maize (Zea mays, corn); the first element identified was named Dissociator (Ds). Likewise, the first transposon to be molecularly isolated was from a plant (snapdragon). Appropriately, transposons have been an especially useful tool in plant molecular biology. Researchers use transposons as a means of mutagenesis. In this context, a transposon jumps into a gene and produces a mutation. Compared with chemical mutagenesis, the presence of the transposon provides a straightforward means of identifying the mutant allele.

Sometimes the insertion of a transposon into a gene can disrupt that gene's function in a reversible manner; transposase mediated excision of the transposon restores gene function. This produces plants in which neighboring cells have different genotypes. This feature allows researchers to distinguish between genes that must be present inside of a cell in order to function (cell-autonomous) and genes that produce observable effects in cells other than those where the gene is expressed.

Transposons are also widely used for mutagenesis in Drosophila melanogaster and in a wide variety of bacteria to study gene function.

Gene silencing

Gene silencing is a general term describing epigenetic processes of gene regulation. The term gene silencing is generally used to describe the "switching off" of a gene by a mechanism other than genetic modification. That is, a gene which would be expressed (turned on) under normal circumstances is switched off by machinery in the cell.

Genes are regulated at either the transcriptional or post-transcriptional level.

Transcriptional gene silencing is the result of histone modifications, creating an environment of heterochromatin around a gene that makes it inaccessible to transcriptional machinery (RNA polymerase, transcription factors, etc.).

Post-transcriptional gene silencing is the result of mRNA of a particular gene being destroyed. The destruction of the mRNA prevents translation to form an active gene product (in most cases, a protein). A common mechanism of post-transcriptional gene silencing is RNAi.

Both transcriptional and post-transcriptional gene silencing are used to regulate endogenous genes. Mechanisms of gene silencing also protect the organism's genome from transposons and viruses. Gene silencing thus may be part of an ancient immune system protecting from such infectious DNA elements.

What is RNAi

RNA interference (RNAi) is a highly evolutionarily conserved process of post-transcriptional gene silencing (PTGS) by which double stranded RNA (dsRNA), when introduced into a cell, causes sequence-specific degradation of homologous mRNA sequences. It was first discovered in 1998 by Andrew Fire and Craig Mello in the nematode worm Caenorhabditis elegans and later found in a wide variety of organisms, including mammals.

Mechanism of RNA interference

A. On entering the cell, long dsRNAs act as a trigger of the RNAi process.

B. The dsRNA is first processed by the RNase III enzyme Dicer in an ATP-dependent reaction.

C. Dicer processes dsRNAs into 21-23 nt short interfering RNAs (siRNAs) with 2-nt 3' overhangs. siRNA can also be synthesized outside the cell and then introduced into it.

D. The siRNAs are incorporated into the RNA-induced silencing complex (RISC), which has an Argonaute (Ago) protein as one of its main components. Ago cleaves and discards the passenger (sense) strand of the siRNA duplex, leading to activation of the RISC.

E and F. The remaining guide (antisense) strand of the siRNA guides RISC to its homologous mRNA, resulting in the endonucleolytic cleavage of the target mRNA.
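The sequence-specific part of steps A to F can be sketched as string operations. This is a toy model under loud assumptions: the sequences are invented, Dicer is reduced to cutting consecutive 21-nt pieces, the 2-nt overhangs and the protein machinery are ignored, and a "cleavage site" is simply the position where a guide strand is fully complementary to the mRNA.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the antiparallel complementary RNA strand."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(sense_strand, length=21):
    """Cut a long dsRNA (given by its sense strand) into consecutive pieces
    and return the guide (antisense) strand of each piece."""
    guides = []
    for i in range(0, len(sense_strand) - length + 1, length):
        guides.append(reverse_complement(sense_strand[i:i + length]))
    return guides


def find_cleavage_sites(mrna, guides):
    """Return mRNA positions that are fully complementary to a guide strand."""
    sites = []
    for guide in guides:
        position = mrna.find(reverse_complement(guide))
        if position != -1:
            sites.append(position)
    return sites


# Invented 42-nt trigger made of two different 21-nt stretches.
trigger = "AUGGCUAGCCGAUUACGGAUC" + "GCUUAACGGCAUCGAUUGCAA"
target_mrna = "GGGAAA" + trigger + "UUUCCC"   # contains the trigger sequence
unrelated_mrna = "ACGU" * 15                  # shares no 21-nt stretch with it

guides = dice(trigger)
print("guide strands:", guides)
print("sites in target mRNA:", find_cleavage_sites(target_mrna, guides))
print("sites in unrelated mRNA:", find_cleavage_sites(unrelated_mrna, guides))
```

Only the mRNA that contains the trigger sequence yields sites, which is the sense in which RNAi is sequence-specific.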

Thursday, July 26, 2007

cath

CATH is a hierarchical classification of protein domain structures, which clusters proteins at four major levels: Class (C), Architecture (A), Topology (T) and Homologous superfamily (H).

Class, derived from secondary structure content, is assigned for more than 90% of protein structures automatically. Architecture, which describes the gross orientation of secondary structures, independent of connectivities, is currently assigned manually. The topology level clusters structures into fold groups according to their topological connections and numbers of secondary structures. The homologous superfamilies cluster proteins with highly similar structures and functions. The assignments of structures to fold groups and homologous superfamilies are made by sequence and structure comparisons.
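The four-level hierarchy maps naturally onto a small data structure. CATH identifiers are commonly written as four dot-separated numbers, one per level; the sketch below assumes that format. The helper names and the two example identifiers are illustrative only and are not part of any CATH software.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    """One number per level of the hierarchy: C, A, T, H."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split an identifier such as '3.40.50.300' into its four levels."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(part) for part in parts))


def level_prefix(code: str, level: str) -> str:
    """Truncate an identifier at the given level: 'C', 'A', 'T' or 'H'."""
    depth = "CATH".index(level) + 1
    return ".".join(code.split(".")[:depth])


def same_fold(code_a: str, code_b: str) -> bool:
    """Two domains share a fold group if they agree down to the T level."""
    return level_prefix(code_a, "T") == level_prefix(code_b, "T")


# Example identifiers written in the dotted style described above.
first, second = "3.40.50.300", "3.40.50.720"
print(parse_cath_code(first))
print(level_prefix(first, "A"))
print(same_fold(first, second))
```

Comparing prefixes is enough to ask at which level two domains diverge: the two examples share class, architecture and topology but belong to different homologous superfamilies.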

The boundaries and assignments for each protein domain are determined using a combination of automated and manual procedures. These include computational techniques, empirical and statistical evidence, literature review and expert analysis.

dna databases

DDBJ (DNA Data Bank of Japan) began DNA data bank activities in earnest in 1986 at the National Institute of Genetics (NIG).
DDBJ has been functioning as the international nucleotide sequence database in collaboration with EBI/EMBL and NCBI/GenBank.
DNA sequence records organismal evolution more directly than other biological materials and, thus, is invaluable not only for research in the life sciences but also for human welfare in general. The databases are, so to speak, a common treasure of human beings. With this in mind, DDBJ makes the databases accessible online to anyone in the world.

The EMBL Nucleotide Sequence Database (also known as EMBL-Bank) constitutes Europe's primary nucleotide sequence resource. Main sources for DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects and patent applications. The database is produced in an international collaboration with GenBank (USA) and the DNA Data Bank of Japan (DDBJ). Each of the three groups collects a portion of the total sequence data reported worldwide, and all new and updated database entries are exchanged between the groups on a daily basis. The current database release (Release 91, June 2007), along with the corresponding release notes and user manual, is available from the EBI servers.
GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. There are approximately 65,369,091,950 bases in 61,132,599 sequence records in the traditional GenBank divisions and 80,369,977,826 bases in 17,960,667 sequence records in the WGS division as of August 2006.

The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange data on a daily basis.