Serial analysis of gene expression


Also listed as: SAGE

Related Terms
  • Bioinformatics, cancer, cytogenetics, deoxyribonucleic acid, DNA, genome, infectious disease, microarrays, molecular biology, mRNA, nucleotide, PCR, pharmacogenomics, polymerase chain reaction, protein, reverse transcriptase, ribonucleic acid, RNA, sequencing, tags, transcription, transcriptome, translation, tRNA.

  • Serial analysis of gene expression (SAGE) is a method of analyzing the activity of genes in a cell. SAGE is unique in that it works backward from a cell's products to the genes that are producing them, and it not only identifies the active genes but also determines the specific level of activity of each one.
  • Genes, which are located inside the nucleus of cells, are considered the blueprints of life because they provide instructions for the construction and function of all of the cells in the body. They do this by instructing cells to make new molecules (usually proteins) specific to the functions of the cells. Genes are carried on very large molecules called DNA (deoxyribonucleic acid). DNA is a very long and complex chemical in the nucleus of cells that carries a four-letter code, which directs the machinery of the cell to do its work. The "letters" of the code are small chemicals called nucleotides: adenine (A), thymine (T), guanine (G), and cytosine (C). A stretch of the code might therefore read "AATGCGCCTTTGAGGTC" and so on.
  • Three letters in a sequence designate a particular amino acid, and amino acids arranged linearly form a protein. Together, the protein-coding parts of the DNA spell out instructions for building every protein in the organism. Proteins are the workers in the body; they do all the construction, make up a good portion of what is built, and perform most of the operations that make up life activities. Current estimates put the number of protein-coding genes in human DNA at roughly 20,000, and because one gene can give rise to more than one protein, the number of distinct proteins in the body is larger. Many other sequences in the DNA do not code for proteins. Some, called regulators, turn genes on and off, but most of them are poorly understood at this time.
  • In order to make proteins, the following steps must occur. Chemical signals, such as hormones, turn on specific genes in the DNA. These genes become active, making a molecule called messenger RNA (mRNA) exactly as the DNA directs. This process is called transcription. Next, the mRNA leaves the nucleus for the cytoplasm, where it meets transfer molecules (tRNA) that carry amino acids. There is at least one type of tRNA molecule for each of the 20 amino acids that make up living proteins, all of which are simply strings of amino acids. Then, in a process called translation, molecular machines called ribosomes, which work much like the assembly lines that make automobiles, gather the amino acids from the tRNA molecules and assemble them into a protein according to the instructions received from the DNA by way of the mRNA.
  • The "genome" is all of the DNA in a cell, including every gene, whether it is active or not. All the genes that are active at a given time, meaning those that are being transcribed, make up the "transcriptome" of the cell or group of cells. A collection of cells that are doing the same thing, such as making a hormone, is called a "tissue," and the cells of a tissue are presumed to share much the same transcriptome. "Transcriptomics" examines the activity level of genes in a given cell at a particular time. Serial analysis of gene expression (SAGE) is one method of determining the transcriptome of a cell.
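
The transcription and translation steps described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the example string from the text is treated as the coding strand of a gene (the article does not say which strand it is), translation is assumed to start at the first AUG, and the function names `transcribe` and `translate` are made up for this illustration. Real cells involve many additional steps that are not modeled here.

```python
from itertools import product

# Standard genetic code, built from the conventional T-C-A-G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA copy of a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from the first AUG start codon up to a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # a stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# Assumption: the article's example string is read as a coding strand.
mrna = transcribe("AATGCGCCTTTGAGGTC")
print(mrna)             # AAUGCGCCUUUGAGGUC
print(translate(mrna))  # MRL (methionine, arginine, leucine)
```

Each group of three letters maps to one amino acid (shown here by its one-letter abbreviation), and the stop codon UGA ends the chain after three amino acids.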

  • The process of serial analysis of gene expression (SAGE) begins by extracting and purifying mRNA from cells. Because mRNA is made only from active genes, the collected mRNA represents the cells' transcriptome. The mRNA then serves as a template for creating matching DNA molecules, called complementary DNA (cDNA), using an enzyme (protein) called reverse transcriptase.
  • A very short piece, or "tag," only about 9-14 nucleotides long, is then cut from a defined position in each of these DNA molecules. The tags are strung together into long chains so that many tags can be read in a single sequencing procedure. Each tag is then compared with libraries of known gene sequences. Only genes that generated mRNA will have produced tags, so a match between a tag and a gene shows that the gene was active. In this way the active genes in the cell can be identified.
  • This process can also count the number of mRNA molecules the cells are making, so it is both qualitative and quantitative. This is possible because each mRNA molecule contributes one tag, so the number of times a given tag appears reflects how abundant that mRNA was in the original sample, and the original ratio of gene products is preserved.
  • The length of these "SAGE tags" determines how specific they are. Shorter tags can pick up unknown genes, which is one of the unique features of SAGE. These unknown genes can then be sequenced, identified, and added to the libraries. Sequencing is the process of identifying, one by one, the string of nucleotides that makes up each gene, in other words, the genetic code. Longer tags can identify every active gene individually and specifically.
  • Once a gene is completely sequenced, its product can be characterized. If the function of that product causes disease, efforts are then directed at inhibiting its function. If the function is beneficial, ways to increase its function are sought.
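
The tag extraction and counting described above can be sketched in Python. This is a minimal illustration, not a laboratory protocol. It makes several assumptions that do not come from the text: the tag is taken as the 10 nucleotides following the last `CATG` site in each cDNA (`CATG` is the recognition site of NlaIII, a commonly used anchoring enzyme), the chaining and sequencing steps are skipped, and the sequences, the `REFERENCE` lookup table, and the gene name in it are invented toy data.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumption: NlaIII anchoring site, not named in the text
TAG_LENGTH = 10       # assumption: within the 9-14 nucleotide range given above


def extract_tag(cdna, tag_length=TAG_LENGTH):
    """Return the tag that follows the last anchor site, or None if absent."""
    site = cdna.rfind(ANCHOR_SITE)
    if site == -1:
        return None  # this transcript cannot produce a tag
    start = site + len(ANCHOR_SITE)
    tag = cdna[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def count_tags(cdnas):
    """Count how often each tag occurs; one cDNA contributes at most one tag."""
    tags = (extract_tag(cdna) for cdna in cdnas)
    return Counter(tag for tag in tags if tag is not None)


def annotate(tag_counts, reference):
    """Pair each tag and its count with a gene name, most frequent tag first."""
    return [
        (tag, count, reference.get(tag, "unmatched"))
        for tag, count in tag_counts.most_common()
    ]


# Toy cDNA sample (invented sequences): three copies of one transcript,
# one copy of another, and one transcript that lacks the anchor site.
cdnas = [
    "GGCATGAAACCCGGGTTTAA",
    "GGCATGAAACCCGGGTTTAA",
    "GGCATGAAACCCGGGTTTAA",
    "TTCATGGGGTTTAAACCCTT",
    "ACGTACGT",
]

# Hypothetical tag-to-gene library; real libraries are far larger.
REFERENCE = {"AAACCCGGGT": "hypothetical_gene_A"}

tag_counts = count_tags(cdnas)
for row in annotate(tag_counts, REFERENCE):
    print(row)
# ('AAACCCGGGT', 3, 'hypothetical_gene_A')
# ('GGGTTTAAAC', 1, 'unmatched')
```

The counts give the quantitative result, and the "unmatched" tag stands in for a transcript from a gene that is not yet in the library. The transcript without an anchor site yields no tag at all, so a sketch like this would never detect it.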

  • Serial analysis of gene expression (SAGE) is uniquely capable of studying the same cells under different circumstances to see which changes take place in the transcriptome. At present, most research in this area is focused on building libraries of genes and other DNA sequences. Once this is accomplished, researchers aim to determine what each gene does and what the transcriptome does when functioning properly. After that, they try to identify specific genes or gene products critical to some cellular function, for example, the ability of a cancer to spread. At this point, researchers can begin attempts to modify this ability by altering the genes or their products.
  • Cancer: SAGE is a principal tool in studying cancer because it is able to identify the entire transcriptome of a cancer. Because a cancer requires multiple mutations to survive, only a composite approach to all of its abnormal functions can hope to arrest its growth.
  • Disease susceptibility: Because of their genetic makeup, some people are naturally resistant to certain diseases and prone to others. SAGE technology may help identify the many complex genetic interactions that underlie disease susceptibility and resistance.
  • Infectious diseases: The human papilloma virus (HPV) has been shown to play a significant role in the development of cervical cancer. Current efforts to identify the differences between normal cervical tissue and tissue in the early stages of the infection that leads to cancer may help with early diagnosis and techniques for treating this common disease. Using SAGE, it is also possible to track the changes in HIV and other viruses that cause drug resistance to develop.
  • Metabolic diseases: Atherosclerosis requires a complex sequence of genetically controlled reactions to form the abnormal chemicals that lead to heart and vascular disease. SAGE may help unravel these chains of events, leading to better understanding of how the disease progresses and how to interrupt its course.
  • Pharmacogenetics: Pharmacogenetics is the study of how an individual's genetics affect the way he or she responds to a drug. Many drugs require chemical transformations in the body before they can take effect. These transformations are genetically controlled. Even the simple act of absorbing a drug from the digestive tract can be influenced by genetic factors. SAGE analysis of these genetic variations may help avoid ineffective treatments.
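
Comparing the same cells under different circumstances comes down to comparing tag counts between two SAGE libraries. The Python sketch below shows one simple way to do this, assuming each library is a table of tag counts. The function name `fold_changes`, the pseudocount, and all of the counts are illustrative assumptions, not part of the SAGE method itself, and a ratio alone does not show that a difference is statistically meaningful.

```python
def fold_changes(control, treated, pseudocount=1.0):
    """Return treated/control abundance ratios for every tag in either library.

    Counts are divided by the library total so that libraries of different
    sizes can be compared. The pseudocount (an assumption of this sketch)
    avoids division by zero for tags seen in only one library.
    """
    total_control = sum(control.values())
    total_treated = sum(treated.values())
    ratios = {}
    for tag in set(control) | set(treated):
        control_share = (control.get(tag, 0) + pseudocount) / total_control
        treated_share = (treated.get(tag, 0) + pseudocount) / total_treated
        ratios[tag] = treated_share / control_share
    return ratios


# Invented tag counts for the same cell type under two conditions.
control_library = {"AAACCCGGGT": 9, "GGGTTTAAAC": 91}
treated_library = {"AAACCCGGGT": 79, "GGGTTTAAAC": 91, "TTTTGGGGCC": 30}

ratios = fold_changes(control_library, treated_library)
for tag, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{tag}  {ratio:.2f}")
```

A ratio above 1 suggests the tag's gene is more active in the treated cells, and a ratio below 1 suggests it is less active. The treated library here is twice the size of the control library, which is why a tag with the same raw count in both ends up with a ratio of about one half.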

  • Each methodology in molecular biology has one thing it does better than any other. Serial analysis of gene expression (SAGE) is able to analyze an entire transcriptome in a single pass and is able to isolate new genes without knowing what they are ahead of time.
  • Cancer: Cancer requires that many genes function in specific ways. Up until now, medical science has been able to attack these genes only one at a time, with meager results. Techniques such as SAGE are exposing the entire complex of abnormalities that allow a cancer to flourish. It is hoped that a better understanding of cancer genetics will lead to effective combination treatments.
  • Disease susceptibility: Although the environment plays a major role in which diseases individuals contract, their genetic makeup also has a powerful influence. Identifying which diseases each person is likely to contract will help in prevention.
  • Infectious diseases: Germs (e.g., viruses, bacteria) are in many ways similar to cancers in that their disease-causing capabilities are genetically determined. Understanding why certain germs cause disease and why they are resistant to antibiotics at a genetic level may lead to more effective treatments.
  • Metabolic diseases: Atherosclerosis, diabetes, and many other chronic diseases have genetic components, the discovery of which may lead to better medical treatments.
  • Pharmacogenetics: An active area of research is personalizing drug treatments according to an individual's genetic makeup. This could make it possible to predict which medicines will work best in a person and to replace the trial-and-error approach now used.

  • Serial analysis of gene expression (SAGE) analyzes a cell's entire transcriptome and therefore cannot be used to identify a subset of functioning genes. It is also unable to deal with multiple samples at once or to directly compare the effects of numerous drugs. Purifying a sample so that it contains only cells performing the same function is a challenge when SAGE is used with typical human tissue samples. Tags may be incorrectly sequenced or may not identify certain transcripts that lack a complementary sequence.
  • Human and methodological errors are always possible, so constant vigilance and repetition are required to validate findings using SAGE. Highly complex hardware that needs constant calibration and maintenance is required. Specimen mishandling can alter results, as can errors in setting the initial conditions for the test run and misinterpretation of data.


Future research
  • Gene libraries are expected to expand in the future. These libraries will form the basis for continued discoveries, eventually leading to medical and biological applications such as disease cures and improvements in the food supply. The areas of current research are likely to remain the main topics of interest for decades to come.

Author information
  • This information has been edited and peer-reviewed by contributors to the Natural Standard Research Collaboration (


Copyright © 2011 Natural Standard (

The information in this monograph is intended for informational purposes only, and is meant to help users better understand health concerns. Information is based on review of scientific research data, historical practice patterns, and clinical experience. This information should not be interpreted as specific medical advice. Users should consult with a qualified healthcare provider for specific questions regarding therapies, diagnosis and/or health conditions, prior to making therapeutic decisions.
