Genomics quiz 4?

Card Set Information

Author:
johnpc
ID:
310457
Filename:
Genomics quiz 4?
Updated:
2015-11-17 20:16:36
Tags:
genomics
Folders:

Description:
genomics quiz

The flashcards below were created by user johnpc on FreezingBlue Flashcards.


  1. What is qPCR and why useful?
    • -Measures quantity of DNA present during PCR.
    • -Especially useful when little DNA/mRNA is present.
    • -Method is more sensitive than Northern blots.
  2. Traditional approaches for quantifying mRNA
    • – The Northern blot
    • – Wholemount in situ hybridization
    • – Ribonuclease protection assay
    • – Conventional PCR
  3. Why not use traditional PCR methods to
    quantitate mRNA?
    • -Only determines quantity of DNA at end of reaction.
    • -Size-based discrimination only
    • -Ethidium bromide/Gel Red are poor stains for quantification
    • -Can only detect 10-fold changes, but qPCR can detect 2-fold changes.
  4. Advantages of qPCR
    • -Measures product after each round of DNA amplification by detecting accumulation of fluorescence.
    • -More accurate for quantifying amount of target present than conventional PCR quantification techniques.
    • -qPCR easier than conventional PCR quantification techniques.
  5. Limitations of qPCR
    • -Expensive
    • -Does not give information on spatial distribution of target.
  6. Examples of qPCR uses
    • -Measuring gene expression
    • -Verifying RNA-seq or microarray results
    • -Measuring DNA damage
    • -Genotyping
    • -Detecting pathogens
    • -Measuring viral load
  7. Cycle number/amount of DNA relationship
    Relationship is a straight line on a logarithmic scale
  8. Threshold
    • Point at which a reaction reaches fluorescent intensity above background. Set in exponential phase of amplification for most accurate reading.
  9. Cycle Threshold, CT
    • The cycle at which the sample reaches threshold.
    • Depends on how much template was available at start of reaction. If little is present, it will take more cycles to reach the threshold (a higher CT).
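As a rough illustration of why less starting template gives a higher CT, the sketch below assumes idealized amplification in which the product multiplies by (1 + efficiency) every cycle. The function name `cycles_to_threshold`, the threshold of 1e10 copies, and the starting copy numbers are illustrative assumptions, not values from the cards.

```python
import math


def cycles_to_threshold(start_copies, threshold_copies, efficiency=1.0):
    """Cycles needed for start_copies to grow to threshold_copies.

    efficiency=1.0 means perfect doubling every cycle (an idealization).
    """
    growth_per_cycle = 1.0 + efficiency
    return math.log(threshold_copies / start_copies) / math.log(growth_per_cycle)


# Hypothetical detection threshold, expressed as a copy number.
threshold = 1e10

for start in (1e6, 1e5, 1e4):
    ct = cycles_to_threshold(start, threshold)
    print(f"{start:.0e} starting copies -> CT of about {ct:.2f}")
```

Under perfect doubling, each 10-fold drop in starting template shifts CT by log2(10), about 3.3 cycles, which is why the relationship is a straight line on a logarithmic scale.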
  10. Baseline
    Portion of curve before exponential changes are detectable.
  11. PCR Reaction steps
    • Denature
    • Annealing
    • Extend
  12. SYBR® Green
    • -Used to monitor DNA synthesis
    • -Fluoresces more when bound to dsDNA
    • -Better than ethidium bromide because…
    • -Brighter
    • -Greater difference in amount of fluorescence in dsDNA compared to ssDNA
  13. Advantages of the SYBR® Green approach?
    • •Less expensive than other methods.
    • •Doesn’t require design of specialized primers.
    • •Good method if you plan to look for different target sequences each time you do qPCR.
  14. Problems with the SYBR® Green approach?
    • •SYBR® Green binds any dsDNA and may bind primer dimers (primers binding to each other instead of binding template). This would cause an inaccurate reading.
  15. TaqMan® Assays
    • -Uses a short probe complementary to sequence of interest.
    • -5’ end has a reporter fluorescent dye and 3’ end has a quencher dye.
    • -If probe intact, quencher reduces fluorescence of reporter by fluorescence resonance energy transfer (FRET).
    • -Probe binds cDNA downstream of forward or reverse primer and extension begins.
    • -As DNA is replicated, Taq polymerase moves forward. Once it reaches TaqMan® probe, 5’ exonuclease activity of Taq displaces the reporter, cleaving it from the probe.
    • -When separated from the quencher, the reporter fluoresces. Amount of fluorescence is proportional to DNA replicated.
  16. Problems with the TaqMan® approach?
    • •Expensive.
    • •Requires knowledge of target sequence for
    • probe design.
  17. Advantages of the TaqMan® assay?
    • •Only binds specific sequence of interest, so don’t have quantification of non-specific products due to primer dimer formation.
    • •Good approach if performing multiple qPCR reactions and always searching for same target.
  18. Molecular Beacons
    • •Probe consists of a fluorescent reporter and a quencher.
    • •When not bound to DNA, the probe doesn’t fluoresce because the molecular beacon forms a stem-loop structure and the reporter and quencher lie close to each other.
    • •When DNA denatures during PCR, molecular beacon binds ssDNA.
    • •Reporter fluoresces because quencher is farther away.
  19. Problems with molecular beacons?
    • •Expensive.
    • •Doesn’t depend on exonuclease activity of DNA polymerase.
    • •Requires knowledge of target sequence for probe design.
  20. Advantages of molecular beacons?
    • •Only binds specific sequence of interest, so don’t have primer dimer problem or quantification of non-specific products.
    • •Good approach if want to perform multiple qPCR reactions searching for same target each time.
  21. Prepare samples for qPCR
    • Isolate Tissue
    • Extract RNA
    • Convert to cDNA
    • qPCR
    • Analyze
  22. qPCR controls
    • -No template control (Negative control)
    • •No DNA added to reaction
    • •Checks reagents for contamination
    • -No reverse transcriptase control (Negative control)
    • •No reverse transcriptase added when made cDNA.
    • •Signal indicates genomic DNA contamination.
    • -Plasmid containing known insert (Positive control)
    • •Checks that reagents and primers work
    • •Especially important if trying to show absence of
    • gene expression
  23. Testing efficiency of qPCR rxn
    • 1- The annealing reactions
    • 2- The melt curve analysis
    • 3- The dilution series
    • 4- The standard curve
  24. Annealing temperature optimization
    • -Determine which temperature makes the most product.
    • -Shouldn’t take too many cycles (less than 28).
    • -Remember a low CT value means that more DNA is present in sample.
    • -Primers performing efficiently will have a lower CT (preferably below 28).
  25. Melt curve analysis
    Do my primers specifically amplify one sequence?

    • Primer dimers are present in this sample, as indicated by the second peak to the left.
  26. Dilution series
    • How accurate is my sample prep?
    • -The dilution series allows you to determine whether you are pipetting consistently
    • from one sample to the next and to assess the efficiency of your assay
    • The resulting line should have a correlation coefficient between 0.990 - 1.05.
  27. Proteome
    • The entire set of proteins produced by a cell, tissue, organism, or genome
  28. Proteomics
    • A branch of biotechnology concerned with applying the techniques of molecular biology, biochemistry, and genetics to analyzing the structure, function, and/or interactions of proteins comprising the proteome.
  29. Antibody
    • -Produced by B-cells
    • -Can be soluble or membrane-bound
    • -Identify foreign materials (bacteria, viruses, etc.) and either tag for destruction by immune system or directly block site on microbe needed for invasion.
    • Contain:
    • -Polypeptide chains (2 light and 2 heavy)
    • -Light chain has a constant and a variable domain
    • -Heavy chain has 3 constant domains and 1 variable
    • -Variable regions bind antigens
  30. Epitope
    • Structural features of the antigen that allow it to be bound by antibody. Most molecules have several epitopes.
  31. Non-fluorescent labeling of antibodies
    -Alkaline phosphatase or horseradish peroxidase
  32. Types of protein microarrays
    • antibody chip
    • antigen chip
  33. Types of functional protein microarrays
    • protein-protein
    • protein-liposomes
    • protein-drug
    • enzyme-substrate
  34. ChIP on chip analysis
    • -Chromatin immunoprecipitation (“ChIP”) with microarray technology (chip).
    • -A method for detecting DNA sequences bound by a protein of interest.
    • -Sample question: Which promoter sequences are bound by
    • transcription factor X?
  35. ChIP-chip vs ChIP Seq
    ChIP-chip identifies the bound DNA with a microarray; ChIP-seq identifies it by high-throughput DNA sequencing.
  36. 2-D PAGE
    • Step 1: Isoelectric focusing
    • •Protein complexes separated by charge
    • •Electrophoresis performed in a pH gradient
    • •Each migrates to isoelectric point (where its pI value equals surrounding pH and net charge is 0).
    • Step 2: Size fractionation
    • •Proteins separated by size
    • •Place isoelectric focusing gel in SDS solution.
    • •SDS separates proteins and confers uniform negative charge.
    • •Perform electrophoresis (PAGE: polyacrylamide gel electrophoresis). Proteins migrate towards positive pole; the smallest move farthest.
  37. How is protein conformation
    determined?
    • -External physical forces also affect protein conformation
    • • Solvation (Which ions or molecules will it associate
    • with?)
    • • Membrane components (Is the aa sequence
    • hydrophilic or hydrophobic?)
    • -Internal forces are also important
    • • Disulfide bridging
    • • Hydrogen bonding
    • • Dipolar interactions
    • • Van der Waals attraction
  38. Chaperonins
    • -Chaperonins are proteins that facilitate folding of other proteins.
    • -Changes in temperature or pH affect how protein folds.
    • -Protein may get stuck in a suboptimal conformation.
    • -Chaperonins use ATP to fold protein back to its native state.
  39. How can we determine protein
    structure?
    • -Directly measure
    • • X-ray crystallography
    • • NMR spectroscopy
    • • Cryoelectron microscopy
    • -Mutate a specific site in the protein
    • (Site directed mutagenesis)
    • -Computer modeling
  40. Synchrotrons advantage
    • •1,000 times more intense than traditional “in-house” X-ray sources.
    • •Allows user to “tune” the X-ray wavelength so can collect high-quality diffraction data much faster (in hours) using very small crystals.
  41. X-ray crystallography advantage
    • • Gives highest spatial resolution for proteins.
    • • First developed in early 20th century, hence very mature.
  42. How X-ray crystallography works
    • Shoot X-rays at the protein.
    • • X-rays are scattered by electron clouds surrounding atoms.
    • • Pattern of scattered X-rays used to determine position of atoms in molecule.
    • • Can measure position of atoms within 2 angstroms.
  43. Synchrotrons disadvantage
    • Require lots of space (some
    • approach 1 mile circumference).
    • • Very expensive
  44. Protein Data Bank Most Abundant Source
    X-ray crystallography
  45. X-ray crystallography disadvantage
    • • Structures revealed not entirely complete (difficult to distinguish carbons from oxygen or nitrogen; must be inferred from rest of the protein structure).
    • • Crystal structure may not equal in vivo structure.
    • • Can’t analyze membrane-bound proteins because they are in a hydrophobic environment and can’t crystallize (50% of drug targets, 30% of human proteome).
  46. Analysis of NMR data
    • Gives distances between pairs of atoms and suggests a limited number of possible structures
    • Final 3-D structure
    • • An average of possible models
    • • Model has plausible bond angles and distances
  47. NMR Advantages
    • • Does not require crystallized protein.
    • • Can determine protein structure under reasonable
    • physiological conditions: aqueous solution at 30°C.
    • • Good way to test possible protein ligands.
    • • Allows measurement of protein “foldedness.”
  48. NMR disadvantages
    • Requires high protein concentrations (~1mM).
    • • Much of data analysis is still done manually.
    • • Only works well on small proteins (less than
    • 25–30 kDa).
  49. Cryoelectron microscopy
    • -Similar to X-ray crystallography except done in a cryoelectron microscope: sample is bombarded with electrons instead of X-rays.
    • -Sample only needs to be frozen in a thin layer of buffer, does not need to be crystallized.
    • -A diffraction pattern is generated using an electron beam.
    • -Trade-off: More electron bombardment means better images, but also damage to specimen.
    • -Data analyzed much like X-ray diffraction data
  50. Cryoelectron microscopy advantages
    • • Proteins do not need to be crystallized
    • • Can be applied to membrane-bound proteins
    • • Used to determine structure of large protein
    • ensembles
  51. Cryoelectron microscopy disadvantages
    • Low resolution compared with NMR and X-ray
    • crystallography
  52. Comparative genomics
    • Study of the relationship of
    • genome structure and function across different
    • biological species or strains.
  53. Why study comparative genomics?
    • Gain insights into evolutionary trends
    • -How have genomes evolved?
    • -Origins of major taxa
    • • Identify characteristics unique to a species
    • • Identify conserved characteristics.
  54. range of metabolic
    activities found in different microbes
    • Aerobic
    • • Anaerobic
    • • Photosynthetic
    • • Sulfur fixing
    • • Nitrogen fixing
  55. Economics and microbes
    • Brewing and fermenting
    • (oldest use of microbes by humans)
    • • Production of organic compounds
    • -Ethanol, amino acids
    • -Antibiotics
    • • Recombinant DNA products
    • -Human insulin
    • -Interferon (used to treat cancer, hepatitis C)
    • • Isolation and production of enzymes
    • -Taq polymerase
    • -Proteases in modern detergents
    • • To break down sewage
    • • Cleaning up oil spills
  56. 1st virus sequenced:
    1982, Bacteriophage lambda (48.5 kb)
  57. 1st bacterium sequenced:
    • 1995, Haemophilus influenzae Rd
    • (1.8 Mb)
  58. 1995, Haemophilus influenzae Rd
    (1.8 Mb)
    • • Few non-coding regions (except in eukaryotes)
    • • Coding regions
    • -25% of genes appear unique to species
    • -40–50% code for proteins with unknown functions
    • - # of genes acting in transcription and translation usually constant
    • - # of genes active in other functions varies tremendously
    • A single species can have great genomic diversity
    • -Up to 22% sequence difference between different strains of same
    • bacterial species
    • Horizontal/lateral gene transfer common
  59. COGs
    • Clusters of Orthologous Groups (of proteins)
    • Proteins from:
    • • Different species
    • • Share a common ancestor
    • • Likely to share similar function
  60. Orthologues
    Two homologues separated by speciation.
  61. Homology:
    Similarity due to shared ancestry.
  62. Paralogues
    • Two homologues separated by a gene duplication
    • event that originally occurred within a single species.
  63. Minimal requirements for a COG?
    • • A group of at least 3 orthologous proteins from distantly related genomes.
    • • These protein sequences are more similar to each other than they are to any other protein from their own genome.
  64. Why classify COGs?
    • Thought to include genes with similar function.
    • •Used to reveal phylogenetic relationships at
    • genome-wide level
    • •Membership in different COGs used to
    • functionally characterize organisms.
    • •COGs were particularly useful in microbial
    • genomics because many microbial genomes had
    • been fully sequenced.
  65. How does lateral gene transfer occur?
    • Transformation
    • • A microbe can absorb foreign DNA
    • Conjugation
    • • One-way transfer of DNA
    • Transduction
    • • Accidental exchange of DNA between two cells via viruses.
    • Phagocytosis of bacteria (in eukaryotes)
    • DNA transfer from organelles to the host’s chromosomes (endosymbionts like mitochondria and chloroplasts)
  66. Detecting lateral gene transfer
    • Look for differences in nucleotide composition
    • between neighboring genomic regions (e.g. %GC)
    • • Presence of rare genes in distantly related genomes
    • Produce phylogenies…Do they change when laterally
    • transferred region used to generate tree?
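The first check on this card (differences in nucleotide composition between neighboring regions) can be sketched as a sliding-window %GC scan. This is only a minimal illustration: the function names, the toy sequence, the window size, and the deviation cutoff of 0.2 are all assumptions chosen for the example, and real analyses use more careful statistics.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """Return (start, gc_fraction) for each full window along the sequence."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def flag_atypical_windows(seq, window, step, max_deviation):
    """Windows whose GC fraction differs from the whole-sequence value
    by more than max_deviation (an arbitrary, illustrative cutoff)."""
    overall = gc_fraction(seq)
    return [
        (start, gc)
        for start, gc in windowed_gc(seq, window, step)
        if abs(gc - overall) > max_deviation
    ]


# Toy "genome": AT-rich host sequence with a GC-rich island in the middle.
host = "ATGCATTA" * 25        # 200 nt, 25% GC
island = "GCGCGGCC" * 5       # 40 nt, 100% GC
genome = host + island + host

for start, gc in flag_atypical_windows(genome, window=40, step=40, max_deviation=0.2):
    print(f"window starting at {start}: GC = {gc:.2f}")
```

A flagged window is only a candidate; the card's other checks (rare genes, phylogenies) would still be needed to support lateral transfer.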
  67. Primary (P-)endosymbionts
    • A class of bacteria that live as symbionts inside cells of
    • host organisms.
    • • Cannot exist as free-living organisms
    • • Host dies without them
    • • ~10% of insects rely on P-endosymbionts
  68. Dominant negative
    have an altered gene product that acts antagonistically to the wild-type allele
  69. how RNAi is useful
    • dsRNA can occur because:
    • *Acquired from viruses that have a dsRNA intermediate in life cycle
    • *Retrotransposons integrate in different orientations. If sense and antisense strands are made (from different regions) they can bind to make dsRNA. RNAi detects the dsRNA and silences expression of the transposon gene products.
  70. synteny
    • Conserved arrangement (order) of genes on chromosomes, similar across several species
  71. Evolution of organelles in eukaryotes
    • Some eukaryotic organelles possess minimal genomes
    • • Mitochondria
    • • Chloroplasts
    • Endosymbiotic bacteria became organelles
    • Some endosymbiont genes were transferred to the nucleus, others lost.
    • Chloroplasts? Initial endosymbiont was a cyanobacterium
  72. Example of a P-endosymbiont: Buchnera
    • • Found in aphids
    • • Provides essential amino acids
    • • Concentrated in large cells called bacteriocytes
    • • Maternally transmitted in aphids
    • Isolation within host cells led to genomic stasis in Buchnera
    • • No phage or insertion sequences
    • • No evidence of past lateral gene transfer
    • • Few repetitive sequences…make sequence more stable (less mutation)
    • Lacks many genes for cell
    • surface proteins
    • • Genes for several regulatory
    • pathways are missing
  73. Standard Curve
    • -Data from the dilution series is used to generate a standard curve. The
    • CT values are plotted versus the log of the starting quantity of DNA.
    • - The resulting line should have a correlation coefficient between 0.990 - 1.05.
    • This would indicate that the qPCR reactions are linear and that an increase
    • in starting DNA content corresponds to a proportional increase in fluorescence.
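A minimal sketch of the standard-curve calculation described above: CT is regressed on log10 of the starting quantity, and the fit quality is reported as R². The CT values below are made up for illustration, and the efficiency formula E = 10^(-1/slope) - 1 is a commonly used convention that is not stated on the card. Note that R² itself cannot exceed 1; the upper figure of 1.05 on the card may correspond to an efficiency of about 105%, which is an assumption here.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(slope):
    """Efficiency implied by the standard-curve slope (1.0 = perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Dilution series from the cards; CT values are invented for this example.
dilutions = [1, 1 / 10, 1 / 100, 1 / 1_000, 1 / 10_000, 1 / 100_000]
ct_values = [15.1, 18.4, 21.8, 25.0, 28.5, 31.7]

log_quantity = [math.log10(d) for d in dilutions]
slope, intercept, r_squared = fit_line(log_quantity, ct_values)

print(f"slope = {slope:.3f}, intercept = {intercept:.2f}")
print(f"R^2 = {r_squared:.4f}")
print(f"efficiency = {amplification_efficiency(slope) * 100:.1f}%")
```

A slope near -3.32 corresponds to perfect doubling per cycle under this convention.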
  74. What if the standard curve correlation coefficient is less than 0.990 or more than 1.05?
    • This means
    • there is more variability in the reaction than preferred. May need to redesign
    • the qPCR primers/probes or increase pipetting efficiency
  75. qPCR components
    • Water, buffer, MgCl2
    • cDNA
    • F, R Primers
    • Probe/mastermix (polymerase, dNTPs)
  76. Positive control in qPCR
    • a plasmid containing the gene of
    • interest is available, this can be used as the template instead of cDNA. Amplified
    • products in the experimental samples should correspond to those in the positive control if
    • the same product is being amplified in both.
  77. No Template control
    • to see if the reagents are contaminated and to detect
    • primer dimer problems
  78. No RT control
    • to see if product is amplified
    • when no cDNA is produced.
  79. No RT Control components
    • Water, buffer, MgCl2
    • B-tubulin
    • F, R Primers for B-tubulin
    • Probe/mastermix (polymerase, dNTPs)
  80. Dilution series
    • 1,
    • 1/10, 1/100, 1/1,000, 1/10,000, 1/100,000
  81. Antigen
    any substance that causes your immune system to produce antibodies against it.
  82. antibody components
    2 light and 2 heavy chains
  83. Immunohistochemistry
    Antibody staining
  84. isoelectric focusing
    • step in 2D PAGE 
    • •Protein complexes separated by charge
    • •Electrophoresis performed in
    • a pH gradient
    • •Each migrates to isoelectric
    • point (where its pI value equals
    • surrounding pH and net charge is 0).
  85. NMR
    • -Some atoms (1H, 13C, 15N) have intrinsic magnetic properties and
    • can switch between magnetic spin states.
    • -Electromagnetic radiation can be used to induce the spin state to flip.
    • -Can record amount of electromagnetic energy absorbed during this
    • process to learn about protein structure.
    • -Amount of energy needed to change spin, depends on proximity of
    • neighboring atoms and types of chemical bonds present.
    • -By using stronger magnets in NMR, finer levels of energy absorption
    • can be detected.
  86. isoelectric point
    • where its pI value equals surrounding pH and net charge is 0
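To make "the pH where net charge is 0" concrete, the sketch below estimates a peptide's net charge with the Henderson-Hasselbalch relationship and finds the zero crossing by bisection. The pKa values are rough, textbook-style assumptions (published tables differ), and the function names are hypothetical; treat the output as an estimate only.

```python
# Approximate pKa values; these are assumptions for illustration only.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}              # side chains that carry +1 when protonated
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}     # side chains that carry -1 when deprotonated
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0


def net_charge(peptide, ph):
    """Estimated net charge of a peptide (one-letter codes) at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))       # N-terminal amino group
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))      # C-terminal carboxyl group
    for residue in peptide.upper():
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(peptide, tolerance=1e-4):
    """Bisection search for the pH at which the estimated net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(peptide, mid) > 0:
            low = mid      # still positive: the pI lies at a higher pH
        else:
            high = mid     # negative or zero: the pI lies at a lower pH
    return (low + high) / 2.0


for peptide in ("GGG", "KKKK", "DDDD"):
    print(peptide, round(isoelectric_point(peptide), 2))
```

Bisection works here because the estimated net charge only decreases as pH rises, so there is a single crossing point.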
  87. Synchrotrons
    • -Provides a high-intensity X-ray source.
    • -A synchrotron consists of:
    • • Electron gun: source of electrons.
    • • Linear accelerator and circular booster ring: accelerate electrons to 90%
    • speed of light.
    • • 2 Storage rings
    • • Beamlines (60 X-ray beamlines in Brookhaven): direct x-rays.
    • -Magnetic and electric fields are optimized so that electrons accelerate in a circle
    • around storage rings, emitting high
    • intensity X-rays that are directed by the
    • beamlines.
  88. Ways to detect antibodies
    • Fluorescent (biotin)
    • Non-fluorescent (Horseradish peroxidase)
  89. The Protein Data Bank
    • -The PDB contains 3-D structural data for proteins, nucleic acids,
    • and carbohydrates.
    • -Managed by scientists from the San Diego Supercomputing
    • Center, Rutgers University and NIST.
    • -Founded in 1971 with 12 structures.
    • …October 8, 2013 = 94,540
    • -All structure files are reviewed by PDB staff for accuracy and data
    • uniformity
    • -Structural data from the PDB can be freely accessed at
    • http://www.rcsb.org/pdb/
  90. antibody structure
  91. Immunoprecipitation
    capture specific protein on column using antibodies
  92. Direct vs indirect immunohistochemistry
  93. 2-D polyacrylamide gel electrophoresis
    (2-D PAGE)
    • -Widely used technology for
    • protein separation
  94. How to compare related gels in 2-D PAGE
    • -Gel-matching software
    • (MELANIE II, CAROL)
  95. Analytical protein microarrays
    • -A high density of affinity reagents are spotted on
    • the microarray chip (e.g. antibodies or antigens).
    • -Protein or antibody added and see which bind to affinity
    • reagent.
  96. amino acid structure
    • Bonds attached to the central (alpha) carbon can rotate; the peptide bond linking the carboxyl group of one amino acid to the amino group of the next cannot (rigid)
  97. To determine how many genes were necessary for
    survival
    single-gene knockdowns
  98. Gene knockdowns
    Any manipulation that reduces expression of a specific gene.
  99. Gene knockdowns approaches
    • -Overexpress a dominant negative
    • -Inject a morpholino
    • -Homologous recombination
    • -RNAi
  100. Overexpress a dominant negative Example
    • Tyrosine kinase receptor
    • Inject mRNA that will be translated into a mutant subunit protein
    • that lacks the site needed to initiate signaling
    • Ligand binds to dimer, but no signaling activated
  101. Morpholinos
    • •Like RNA but ribose replaced with morpholine group;
    • phosphodiester backbone modified.
    • •Resists degradation by cell.
    • •Complementary to RNA sequences.
  102. Translation blocking morpholinos
    • Binds near initiation site (overlapping portions of the 5’UTR
    • and the first exon), interfering with ribosome assembly
    • and prevents translation.
  103. Spliceblocking morpholinos
  104. Homologous recombination (knockout)
    • Relies on recombination between chromosomal DNA and a vector carrying positive and negative selectable markers.
    • Recombination occurs between identical sequences in vector and chromosomal DNA.
  105. Often organisms appear not to “need” many of their genes
    -Evidence?
    • -If you knockdown or knockout any single gene in a mouse, most of
    • the time the mouse will survive.
    • -Only about 30% of the genes are necessary for survival!
    • -In yeast, more than 4,700 single-gene knockouts were performed in
    • homozygous diploid lines. Only 10.7% exhibited reduced growth/
    • viability. Growth of 83.5% of the knockouts was unaffected
  106. redundancy refers to pairs of homologous genes
    • with functional overlap where one can compensate
    • for loss of the other.
  107. Sources of genetic redundancy?
    • -Gene duplication
    • -Genome duplication
    • -Convergent evolution
  108. Implications of high occurrence of redundancy in signaling components?
    • •Functional overlap in redundant genes may be beneficial in
    • maintaining ability to signal.
  109. Examples of redundant signaling proteins?
    • -Hox genes
    • -Wnt proteins
    • -Myogenic regulators (MyoD, Myf5, myogenin, Mrf4)
  110. -In study of 59 pairs of redundant genes (yeast), the
    redundant forms were
    • not expressed at the same
    • time and/or place. Expression patterns didn’t overlap.
  111. In vertebrate developmental pathways, redundant
    duplicates are
    • expressed in spatially or temporally
    • distinct areas.
  112. Knocking out one gene often results in
    • upregulation of the
    • redundant partner.
  113. Redundancy may allow
    compensation when one isoform lost
  114. somites
    • -Early in vertebrate development, segmented blocks of tissue appear on either side of the nerve cord
    • -The somites form muscle, tendons, endothelial cells, dermis, and cartilage.
  115. Portions of the somite forming
    adult skeletal muscle divided
    into two domains…
    • -Epaxial: dorso-medial region (ep)
    • -Hypaxial: ventrolateral region (hyp)
  116. Redundancy may allow adaptation to
    local conditions
  117. Redundancy can be used to increase ability of cell
    • to sense changes in
    • environment and respond.
  118. Redundancy may improve
    processing of external info
  119. Gene redundancy may provide important opportunities
    for the organism…
    • -Compensating for loss of another molecule
    • -Adapting to changing external factors
    • -Improved processing of external information
  120. -How are new gene families created?
    A) Divergent evolution

    B) Concerted evolution

    C) Birth-and-death evolution
  121. Divergent evolution
    • A group of genes is duplicated.
    • -Evolution of gene A doesn’t affect evolution of gene B.
    • -Each gene gradually diverges as mutations accumulate.
    • -Duplicate genes assume new functions.
    • Ex: alpha and beta globins
  122. Concerted evolution
    • -Gene family members do not evolve independently. Evolve at same time.
    • -In ribosomal RNA of frogs, find that intergenic regions more similar in different rRNAs of same species than in two related Xenopus species.
    • -Why? If one gene acquires a mutation, it spreads to the other rRNAs by unequal crossing over or by nonreciprocal recombination.
    • Ex: 5S rRNA in Xenopus, primate U2 snRNA
  123. Birth-and-death evolution
    • -New genes made by gene duplication.
    • -Some remain, others become pseudogenes or are deleted.
    • -Pattern in phylogenetic tree is more difficult to interpret.
    • Ex: Major histocompatibility complex genes, T-cell receptors, MADS-box genes, ubiquitins, etc.
  124. The domainome
    • Instead of focusing on genes, look at evolution
    • of protein domains.
    • All the domains from a dataset are the “domainome” for that group of sequences.
  125. How can the G-value paradox be explained?
    • -More potential combinations between proteins
    • -Multifunctional genes
    • -Alternative splicing
    • -Transcriptional control
    • -Posttranslational modification
    • -Roles of non-coding RNA
  126. Dollo parsimony
    • assumes that once a complex trait is lost in a lineage, it cannot be
    • regained
  127. Domains acting in cell regulation
    • increased during
    • eukaryotic evolution.
  128. Domains related to metabolism
    • decreased during
    • eukaryotic evolution.
    • Suggests that roles of metabolic domains may be taken over by symbionts
  129. High number of domains in basal groups supports idea that
    • the last
    • common ancestor of eukaryotes was complex.
  130. Class I elements
    retrotransposons
  131. Class II elements
    DNA transposons
  132. Insertion of transposable elements can harm the host by
    • Insertional mutagenesis, chimeric transcript production, antisense effects, and illegitimate recombination
  133. How does the genome defend itself from transposable elements?
    • RNAi
    • MicroRNA
    • Cytosine methylation
    • Defensive mutagenesis
  134. RNA interference
    (RNAi)
    • 1- In cytoplasm, Dicer binds dsRNA.
    • 2- Dicer cleaves dsRNA to form small interfering RNAs (siRNAs, 21-23 bp long with 2-nucleotide overhangs at the 3’ ends).
    • 3- The RNA-induced silencing complex (RISC) forms when the antisense strand of siRNA associates with Argonaute 2 (AGO2) protein. May also include other protein types.
    • 4- RISC complex scans RNAs to find complementary sequence to siRNA. The siRNA antisense strand base-pairs with the target mRNA and RISC complex cleaves target.
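Step 4 (scanning for a sequence complementary to the siRNA) can be sketched as a search for the reverse complement of the antisense (guide) strand within an mRNA. This is a deliberately simplified model that requires perfect complementarity over the whole guide; the function names and the toy mRNA are assumptions for illustration.

```python
# Translation table for RNA base pairing (A-U, G-C).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Reverse complement of an RNA string written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(guide, mrna):
    """Start positions (0-based) in mrna that pair perfectly with the guide strand."""
    target = reverse_complement_rna(guide)
    mrna = mrna.upper()
    sites = []
    start = mrna.find(target)
    while start != -1:
        sites.append(start)
        start = mrna.find(target, start + 1)
    return sites


# Toy mRNA; the guide is built to pair with the 21 nt starting at position 5.
mrna = "AUGGCUACGUUAGCCGAUAGGCUAACCGUAA"
guide = reverse_complement_rna(mrna[5:26])

print("guide (antisense) strand:", guide)
print("target sites in mRNA:", find_target_sites(guide, mrna))
```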
  135. MicroRNAs
    • 1- MicroRNA is transcribed from DNA.
    • 2- These short RNA sequences form hairpin loops and are transported to the cytoplasm by exportin 5.
    • 3- In the cytoplasm, Dicer trims the dsRNA (22 bp seq with overhang).
    • 4- miRNAs silence gene expression by binding complementary regions in the 3’ UTR of target mRNAs (animals) or by binding coding regions of target mRNAs (plants).
    • Base pairing is often partial and binding of the miRNA affects multiple mRNA types.
    • Effects? Inhibiting translation, causing loss of poly-A tail, interfering with methylated cap/poly-A tail interactions, or causing mRNA degradation by exonucleases.
  136. Cytosine methylation
    • -Bacteria use DNA methylation to protect the genome from degradation by restriction enzymes.
    • -Restriction enzymes cannot destroy methylated restriction sites but can cleave unmethylated restriction sites.
    • -Endogenous methyltransferases attach methyl groups to cytosines or adenines.
    • -In eukaryotes, modified 5-methylcytosine is made by adding a methyl group to the 5 position of cytosine.
    • -Causes transcriptional silencing when promoter region methylated. Good for long-term silencing. Doesn’t appear to be reversible.
    • -Repetitive DNA in plants and mammals is usually methylated.
  137. Defensive mutagenesis
    • -A method to block retroviral replication.
    • -In primates, during reverse transcription, the host protein APOBEC3G is incorporated in 1st strand cDNA.
    • -It deaminates cytosines in the retrovirus cDNA strand, converting them to uracils.
    • -During second strand synthesis, uracil is read as thymine, so adenine is inserted in the new DNA strand.
    • -Deactivates virus by mutating up to 25% of cDNA guanine residues.
    • -Not as effective in HIV-1 retrovirus. This virus makes Vif (viral interference factor) which inhibits activity of the APOBEC3G protein.
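The strand logic on this card (C to U on the first cDNA strand, read as T, so A is inserted on the second strand) ends up as G-to-A changes on the plus strand. The sketch below traces that under simplifying assumptions: the plus strand is written with DNA letters, each cytosine is deaminated independently with a fixed probability, and the function names and toy sequence are hypothetical.

```python
import random

# Base pairing used during strand synthesis; uracil pairs like thymine.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(strand):
    """Complementary strand, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(strand))


def deaminate(minus_strand, fraction, rng):
    """Convert each C to U with probability `fraction` (a toy stand-in for APOBEC3G)."""
    return "".join(
        "U" if base == "C" and rng.random() < fraction else base
        for base in minus_strand
    )


def hypermutate(plus_strand, fraction=0.25, seed=0):
    """Plus-strand sequence after deamination of the first (minus) cDNA strand."""
    rng = random.Random(seed)
    minus_cdna = reverse_complement(plus_strand)     # first-strand synthesis
    edited = deaminate(minus_cdna, fraction, rng)    # C -> U on the minus strand
    return reverse_complement(edited)                # second-strand synthesis reads U as T


original = "ATGGCGTGAGGCTGGA"
mutated = hypermutate(original, fraction=0.25)
print(original)
print(mutated)
```

Every difference between the two printed sequences is a G in the original that became an A, matching the card's statement that guanine residues are the ones mutated.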
  138. WHY Protein alignments are more useful if you’re comparing distantly
    related sequences.
    • -Peptide sequences are more likely to be conserved than nucleotide
    • sequences since there are multiple codons for the same amino acid.
    • - Some amino acids have similar biophysical properties. Similarities can
    • be accounted for in the protein scoring matrices.
    • -Similarities arising early in the evolutionary process may be detectable
    • using a protein alignment, but not a DNA alignment.
  139. WHY Nucleotide alignments can be more useful if you’re comparing closely
    related sequences.
    • -If sequences are very closely related, the amino acid sequences may
    • not vary much. Nucleotide sequence will vary more, allowing easier
    • detection of differences.
  140. Global Sequence Alignment
    • -Tries to align two sequences along entire length.
    • -Best for highly similar sequences of same length.
    • -As similarities decrease, misses important relationships.
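A minimal sketch of global alignment scoring, using the classic dynamic-programming approach (Needleman-Wunsch). The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions for illustration, not values from the card.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Best score for aligning a and b along their entire lengths."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalized because the whole length must be aligned.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATACA"))
```

Because every position in both sequences must be placed in the alignment, unrelated flanking sequence drags the score down, which is why this approach suits similar sequences of similar length.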
  141. Local Sequence Alignment
    • -Looks for the most similar regions in sequence instead
    • of trying to align entire length.
    • -May return more than 1 result if there is more than
    • 1 subsequence in common.
    • -Good method to use if sequences differ in length or
    • share partial similarity.
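A minimal sketch of local alignment scoring in the style of Smith-Waterman. The scoring values (match +2, mismatch -1, gap -2) are assumptions chosen for illustration. The key difference from a global alignment is that a cell's score is never allowed to fall below zero, so an alignment can start and end anywhere.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Score of the best-matching pair of subsequences in a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets a new local alignment start at this cell.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared region GATTACA is found despite unrelated flanking sequence.
print(local_alignment_score("TTTTGATTACATTTT", "CCCGATTACACCC"))
```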
  142. Dot plots
    • -A simple way to visually compare 2 sequences to
    • find local alignments, direct repeats, inverted
    • repeats, insertions, deletions or low-complexity regions
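A dot plot is simple enough to sketch directly: put one sequence along each axis and mark every cell where the two residues match. The function name `dot_plot` is hypothetical, and real tools usually add a sliding window to reduce noise, which is omitted here.

```python
def dot_plot(seq_x: str, seq_y: str) -> list[str]:
    """One text row per residue of seq_y; '*' marks a match with seq_x."""
    return ["".join("*" if x == y else "." for x in seq_x) for y in seq_y]


for row in dot_plot("GATTACA", "GATTACA"):
    print(row)
```

A shared region shows up as a diagonal run of marks; off-diagonal runs parallel to the main diagonal indicate repeats.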
  143. Why aren’t dot plots used more?
    Do not provide a measure of statistical similarity
  144. Blosum62 matrices
    • -Score is the log of an odds ratio. Considers how often, in nature,
    • a particular residue is substituted for another versus how often
    • this substitution would occur if by random chance.
    • S_{i,j} = log [ q_{i,j} / ( p_i p_j ) ]
    • S_{i,j} is the score for a replacement of residue i with residue j.
    • q_{i,j} represents how often the two amino acids align with each
    • other in multiple sequence alignments of protein groups.
    • p_i is the probability that residue i will occur among all proteins.
    • p_j is the probability that residue j will occur among all proteins.
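The log-odds score above can be worked through numerically. The sketch below follows the formula as written on the card; the frequencies in the example are invented for illustration, and the scaling (log base 2, multiplied by 2 and rounded, i.e. half-bit units) is an assumption about how published matrix entries are expressed.

```python
import math


def log_odds_score(q_ij: float, p_i: float, p_j: float, scale: float = 2.0) -> int:
    """Rounded, scaled log of observed over expected pairing frequency."""
    return round(scale * math.log2(q_ij / (p_i * p_j)))


# Hypothetical frequencies: both residues occur 10% of the time.
print(log_odds_score(0.04, 0.1, 0.1))    # pair seen 4x more often than chance
print(log_odds_score(0.0025, 0.1, 0.1))  # pair seen 4x less often than chance
```

A positive score means the substitution is seen more often than random chance predicts, a negative score means it is seen less often, and a score of zero means it occurs at the chance rate.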
  145. BLAST
    • Relies on “seeding”. Looks for a short query word. Finds this and
    • related words.
    • -To score related words, uses a scoring matrix. The set of words scoring at or above a threshold is called the “neighborhood.”
    • -Threshold setting controls how many options allowed in the neighborhood.
    • -Then performs local alignment. Extends until gaps and mismatches decrease
    • score below the score threshold (S)…This info recorded by BLAST.
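The seeding step can be sketched by enumerating every word of the same length and keeping those that score at or above the threshold against the query word. This is a simplified illustration, not BLAST's actual implementation: the nucleotide alphabet and the match/mismatch scoring function are assumptions, and `neighborhood` is a hypothetical name.

```python
from itertools import product


def neighborhood(word: str, alphabet: str, score, threshold: int) -> list[str]:
    """All words whose pairwise score against `word` meets the threshold."""
    hits = []
    for candidate in product(alphabet, repeat=len(word)):
        total = sum(score(a, b) for a, b in zip(word, candidate))
        if total >= threshold:
            hits.append("".join(candidate))
    return hits


def simple_score(a: str, b: str) -> int:
    """Toy scoring: +2 for a match, -1 for a mismatch (assumed values)."""
    return 2 if a == b else -1


print(neighborhood("ACG", "ACGT", simple_score, threshold=6))       # exact word only
print(len(neighborhood("ACG", "ACGT", simple_score, threshold=3)))  # allows 1 mismatch
```

Lowering the threshold enlarges the neighborhood, which increases sensitivity at the cost of more seeds to extend.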
  146. BLAST
    Problems?
    • -Entering a sequence from a low-complexity region can
    • inflate BLAST scores.
    • (Masking with DUST or SEG counters this.)
    • -The hit list contains entries that represent hypothetical
    • proteins.
    • -Hits to ESTs should be treated with caution. Sequencing
    • accuracy is lower than in “finished”
    • sequences.
  147. How does BLAST determine length
    of alignment?
    • -Program measures cumulative
    • score as alignment is extended.
    • -If the score drops off after a peak
    • by more than a threshold value (X),
    • extension is terminated and the alignment
    • is trimmed back to the preceding peak in
    • the curve.
    • -HSP = high-scoring segment pair;
    • this is the trimmed alignment.
    • -Then calculates E value to determine if alignment
    • is significant.
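The drop-off rule can be sketched for an ungapped extension in one direction. This is a simplified illustration under assumed scoring values (match +1, mismatch -2), and `extend_right` is a hypothetical helper, not a BLAST function.

```python
def extend_right(query: str, subject: str, x_drop: int,
                 match: int = 1, mismatch: int = -2) -> tuple[int, int]:
    """Extend position by position; return (peak score, length at the peak)."""
    score = best = best_len = 0
    for k, (q, s) in enumerate(zip(query, subject), start=1):
        score += match if q == s else mismatch
        if score > best:
            best, best_len = score, k   # new peak
        elif best - score > x_drop:
            break                       # fell too far below the peak: stop
    return best, best_len               # alignment is trimmed to the peak


print(extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", x_drop=3))
```

A single mismatch inside an otherwise good region does not stop the extension as long as the running score stays within X of the peak.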
  148. acidophiles
    extremophiles that can survive pH as low as 0.7
  149. alkaliphiles
    extremophiles that can survive pH as high as 12.0
  150. Archaea
    • Only unicellular
    • Present:
    • - Circular chromosome and plasmids
    • - Eukaryote-like RNA/DNA polymerases
    • - TATA boxes
    • - Unusual membranes
    • - Most biosynthetic pathways more like bacteria
    • Absent:
    • - Nuclear membrane
    • - Organelles
  151. Bacteria
    • All are unicellular
    • Present:
    • - Circular chromosome and plasmids
    • Absent:
    • - Organelles
    • - Nuclear membrane
  152. Eukarya
    • Both unicellular and multicellular members
    • Present:
    • - Nuclear membrane
    • - Organelles
  153. exobiology
    • The study of life outside of the terrestrial
    • biosphere
  154. extremophiles
    • can survive
    • conditions that would kill other
    • organisms!
  155. Halophiles
    • 10% salt
    • concentration
  156. Piezophiles
    • pressures > 1
    • atmosphere
  157. Psychrophiles
    < 5°C
  158. Thermophiles
    50–113°C
  159. Prokaryotes
    • Few non-coding regions
    • A single species can have great genomic diversity
    • Horizontal/lateral gene transfer common
  160. Use of identifying COGs
    • ways to identify
    • functions of novel sequences
  161. Why rRNA used in constructing universal tree of life
    rRNA is universal among all life
  162. Challenges to the universal tree of life
    lateral gene transfer
  163. Horizontal/lateral gene transfer
    transfer of genetic material between different evolutionary lineages
  164. Why perform multiple sequence alignments?
    • -To detect mutations, insertions, or deletions in similar sequences
    • -To find domains
    • -To identify secondary structure (alpha helices, beta pleated sheets, etc.)
    • -Useful to display in a publication
    • -To examine evolutionary relationships or establish homology
    • (Multiple sequence alignment → Phylogenetic analysis)
  165. Examples of programs:
    • Clustal Omega
    • T-Coffee
    • MUSCLE
  166. Phylogenetic analysis for…
    • -Gene family identification
    • -Inferring gene functions
    • -Finding origins of a genetic disorder
    • -Determining evolutionary relationships
  167. Phenetics analysis
    • -Classifies based on similarity alone
    • -Does not consider evolutionary relationships of species
  168. Disadvantages of phenetics
    • -May miscategorize organisms based on similarities due to
    • convergent evolution
    • -May give misleading results in closely related species that
    • acquired phenotypic differences rapidly (adaptive radiation).
    • -In cases where many plesiomorphic (ancestral) traits are
    • retained in different groups, method may place members
    • in same clade (tendency towards monophyletic categorization).
  169. Advantages of phenetics
    • -Useful if want to detect distinctness of a taxon (often useful
    • for species level studies or detecting horizontal gene transfer).
    • -Computationally cheap
  170. Cladistics
    • -A method to classify organisms into clades (a group
    • containing the ancestor and all of its descendants)
    • -Focuses on synapomorphies (shared derived traits)
    • -A more robust method for determining phylogenetic
    • relationships
    • -Use cladograms to show relationships between species
  171. Plesiomorphic:
    • Traits inherited
    • from an ancestor (Ex: canine teeth).
    • Uninformative trait since all have this
  172. Apomorphic
    • Newly evolved
    • trait (Ex: flat face in Persians).
  173. Synapomorphic:
    • New traits
    • shared by members of the same
    • clade and their most recent common
    • ancestor (Ex: dark tips in Siamese,
    • Burmese, Singapura, etc.)
  174. Cladogram
    • -A branching diagram
    • obtained using cladistic
    • analysis that represents the
    • hypothesized relationships
    • among taxa.
    • -Branch lengths are
    • uninformative
    • Measure “characters” and find tree with the fewest transitions between different states.
  175. Phylogram
    • Branch lengths reflect number of character changes that
    • occurred during evolution.
    • Example: In a phylogram generated using DNA, a branch length of 4 units means
    • that an average of 4 substitutions occurred at each nucleotide site.
  176. Chronogram
    Branch lengths reflect evolutionary time required for changes.
  177. What sorts of information can be
    used to generate phylogenies?
    • -Morphological traits
    • -Protein or gene sequences (coding seqs)
    • -Non-coding DNA/RNA
  178. Mitochondrial DNA
    • -Useful to track ancestry through
    • female lineage.
    • -Faster mutation rate than nuclear
    • DNA (in humans) makes it easier
    • to detect recent evolutionary events.
    • “Molecular Clock”
    • -Evidence that maternal
    • mitochondrial DNA may mix with
    • paternal mitochondrial DNA
  179. Ribosomal RNA
    • -Often chosen because
    • ribosomal RNA contains
    • both highly conserved
    • (18S) and highly variable
    • (NTS, non-transcribed
    • spacer) sequences.
    • -Therefore it can be
    • used either to look at very
    • distant groups or to
    • compare closely related
    • species
  180. Why generate phylogenies
    for genomics?
    • •To classify a gene into the appropriate subfamily (gene
    • identification)
    • •To test predictions about relatedness and origins of
    • genes in a multigene family
    • •To understand how organisms are related
  181. Unrooted trees
    • -Specify relationships between members of tree, but do not
    • indicate the evolutionary path.
    • -Does not assume that members share a common ancestor.
  182. Midpoint rooting:
    • Root in middle of longest path between the two
    • most distantly related taxa. Assumes rate of evolution is same along all
    • branches…often this is not a reliable assumption.
  183. How to place the root
    • Midpoint rooting
    • Rooting with an outgroup
  184. Rooting with an outgroup
    • Choose a taxon that is more
    • distantly related to all ingroup taxa than any of the ingroup taxa are
    • related to each other. Needs to be DISTANTLY related or may have
    • same problem as midpoint rooting (unequal evolutionary rates). Outgroup
    • needs to be homologous.
  185. Two approaches to tree construction
    • Algorithmic: Uses an algorithm to generate a
    • tree from the data.
    • Tree-Searching: Program constructs many trees
    • and compares them.
  186. Algorithmic: Uses an algorithm to generate a
    tree from the data.
    Advantages
    • -Fast
    • -Produces only one tree per dataset.
  187. Methods using Algorithmic tree construction
    • • Distance methods
    • -A phenetic or clustering approach
    • -Use direct comparison of
    • sequences to assess similarity.
    • Examples:
    • -UPGMA (Unweighted Pair-Group
    • Method with Arithmetic Mean)
    • -Neighbor Joining
  188. UPGMA steps
    At each step, the nearest two clusters are combined into a higher-level cluster. The distance between any two clusters A and B is taken to be the average of all distances between pairs of objects "x" in A and "y" in B, that is, the mean distance between elements of each cluster: d(A, B) = (1 / (|A| · |B|)) Σ_{x∈A} Σ_{y∈B} d(x, y)
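The clustering procedure can be sketched as follows. This is a minimal, unoptimized illustration that recomputes average distances directly from the definition and returns only the tree topology as nested tuples (no branch lengths). The distance values and the names `average_distance` and `upgma` are assumptions made for the example.

```python
from itertools import combinations


def average_distance(members_a, members_b, dist) -> float:
    """Mean of all pairwise distances between leaves of two clusters."""
    total = sum(dist[frozenset((x, y))] for x in members_a for y in members_b)
    return total / (len(members_a) * len(members_b))


def upgma(names, dist):
    """Repeatedly merge the two nearest clusters; return nested tuples."""
    clusters = [(name, (name,)) for name in names]  # (subtree, leaf members)
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_distance(clusters[p[0]][1], clusters[p[1]][1], dist),
        )
        (tree_i, mem_i), (tree_j, mem_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), mem_i + mem_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical pairwise distances between four taxa.
d = {frozenset(p): v for p, v in {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10}.items()}
print(upgma(["A", "B", "C", "D"], d))
```

In this example A and B join first, then C joins the A/B cluster, and D joins last.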
  189. UPGMA problems
    • -Assumes that rate of nucleotide or amino acid substitution is
    • same in all branches of tree (constant rate of evolution). If not
    • true, produces incorrect tree.
    • -Branch lengths only show averages for each pair
    • -Less accurate than other methods…less likely to be used for
    • phylogenies.
  190. UPGMA
    • (Unweighted Pair Group Method with Arithmetic Mean)
    • is a simple agglomerative (bottom-up) hierarchical clustering method
  191. Neighbor:
    • a pair of taxa
    • connected through a
    • single interior node X
    • in an unrooted,
    • bifurcating tree
  192. Producing a Neighbor-Joining tree
    • Step 1 - Taxa clustered in
    • a star-like tree.
    • Step 2 - Pairwise comparisons
    • made to identify the two
    • taxa with shortest sum
    • of branch lengths (most similar).
    • Step 3 - Selected taxa
    • connected to others
    • via internal branch XY.
    • Step 4 - The joined pair (H/F in the example) is now treated as 1 taxon. Search again for the most similar pair. Could be
    • 2 different taxa or 1 may be similar to H/F. Repeat until N-3 internal branches found.
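The pair-selection step can be sketched with the standard neighbor-joining criterion, which picks the pair minimizing Q(i, j) = (n - 2)·d(i, j) - r_i - r_j, where r_i is the sum of distances from taxon i to all others. Only one selection step is shown, the distances are invented, and `nj_pick_pair` is a hypothetical name.

```python
from itertools import combinations


def nj_pick_pair(names, dist):
    """Return the pair of taxa that neighbor joining would join first."""
    n = len(names)
    row_sum = {a: sum(dist[frozenset((a, b))] for b in names if b != a)
               for a in names}

    def q(pair):
        a, b = pair
        return (n - 2) * dist[frozenset(pair)] - row_sum[a] - row_sum[b]

    return min(combinations(names, 2), key=q)


# Hypothetical distances where B and D sit on long branches.
nj_dist = {frozenset(p): v for p, v in {
    ("A", "B"): 5, ("A", "C"): 3, ("A", "D"): 6,
    ("B", "C"): 6, ("B", "D"): 9, ("C", "D"): 5}.items()}
taxa = ["A", "B", "C", "D"]
closest = min(combinations(taxa, 2), key=lambda p: nj_dist[frozenset(p)])
print(closest, nj_pick_pair(taxa, nj_dist))
```

Here the smallest raw distance is between A and C, but the criterion selects A/B (tied with C/D) because it corrects for how far each taxon is from everything else. This is why neighbor joining copes better with unequal evolutionary rates than simple closest-pair clustering.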
  193. Producing a Neighbor-Joining tree
    • -Produces an unrooted tree unless outgroup specified.
    • -Final tree may not be the one with shortest overall
    • branch lengths.
    • -Good approach to use if studying a large group of taxa.
  194. Tree-Searching
    • Program constructs many trees
    • and compares them.
  195. Tree-Searching construction advantage
    • Produces subset of trees consistent with dataset.
    • -Allows you to evaluate the alternative possibilities.
  196. Methods using tree searching tree construction
    • -Parsimony
    • -Maximum likelihood
    • -Bayesian analysis
  197. Parsimony assumption
    • 1- Most likely tree has fewest changes in sequence.
    • 2- Taxa with common characteristics share them
    • because they inherited these characteristics from
    • the same common ancestor.
  198. Homoplasies
    • Similarities that
    • require “extra” steps
    • or hypotheses
    • to explain data
  199. Why parsimony assumptions not true sometimes
    • -Reversal (character changes but reverted back)
    • -Convergence (traits evolved independently
    • in two unrelated taxa)
    • -Parallelism (different taxa have similar properties
    • that predispose a characteristic to develop a
    • certain way…ex: if early development similar
    • limits later stages possible)
  200. Parsimony
    • select trees that minimize homoplasy
    • (Similarities that
    • require “extra” steps
    • or hypotheses
    • to explain data)
  201. Example of a method using parsimony:
    : Maximum Parsimony
  202. Generating a tree using maximum parsimony
    • Step 1 - Identify informative sites
    • -Need at least 2 character states, each present in at least
    • 2 taxa.
    • Step 2 - Construct trees
    • -If less than 12 taxa, look at all trees.
    • -With more than 12 taxa, use heuristic search
    • to ignore options unlikely to produce shortest tree.
    • Step 3 - Count the number of changes and select tree
    • with the shortest branch lengths (fewest changes).
    • -This is the most parsimonious tree!
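Step 1 can be sketched directly: a column of the alignment is parsimony-informative when at least two different character states each appear in at least two taxa. The alignment below is invented for illustration, and the function names are hypothetical.

```python
from collections import Counter


def is_informative(column) -> bool:
    """True if at least 2 states each occur in at least 2 taxa."""
    counts = Counter(column)
    return sum(1 for c in counts.values() if c >= 2) >= 2


def informative_sites(alignment) -> list[int]:
    """Indices of parsimony-informative columns (0-based)."""
    return [i for i, col in enumerate(zip(*alignment)) if is_informative(col)]


# Hypothetical alignment of four taxa.
aln = ["AAGA",
       "AAGC",
       "ATCG",
       "ATCT"]
print(informative_sites(aln))
```

Column 0 is invariant and column 3 has all states different, so neither helps choose between trees; only columns 1 and 2 are informative.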
  203. Uninformative nucleotides in maximum parsimony
    • 1- invariant
    • 2- unique (only 1 is different)
    • 3- all differ
    • 4- in analysis, always show 2 changes…can’t tell
    • whether ancestral or derived
  204. Tree-Searching Methods:
    • -Parsimony
    • -Maximum likelihood
    • -Bayesian analysis
  205. Maximum likelihood
    • -A statistical approach that uses a model of evolution.
    • -Always produces 1 tree.
    • -Allows you to decide which evolutionary model to use.
    • -Lets you know the likelihood of generating that tree.
    • -If rates of evolution vary in branches it is not a problem.
  206. Problems with parsimony
    Long branch attraction
  207. Long branch attraction
    Parsimony can misinterpret data because of differences in evolutionary rate. Fast-evolving lineages (long branches) can be grouped together incorrectly.
  208. How likelihood generates tree
    • -Choose a model of evolution (models of nucleotide or aa substitution).
    • -Assigns probabilities to mutational events.
    • -Assigns branch lengths based on probabilities that a mutation will occur.
    • -Tree with highest likelihood assumed most likely to occur.
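The idea of choosing a model, assigning probabilities to mutational events, and picking the value with the highest likelihood can be sketched for the simplest case: one branch separating two sequences. The sketch assumes the Jukes-Cantor substitution model (all substitutions equally likely) and a coarse grid search, both chosen here for illustration; real programs optimize over whole trees.

```python
import math


def jc69_log_likelihood(seq_a: str, seq_b: str, d: float) -> float:
    """Log-likelihood of two aligned sequences separated by distance d > 0."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # probability a site is unchanged
    p_diff = 0.25 - 0.25 * e   # probability of one specific change
    total = 0.0
    for x, y in zip(seq_a, seq_b):
        total += math.log(0.25 * (p_same if x == y else p_diff))
    return total


def best_branch_length(seq_a: str, seq_b: str, candidates) -> float:
    """Candidate distance with the highest likelihood."""
    return max(candidates, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))


grid = [0.01 * k for k in range(1, 101)]
print(best_branch_length("ACGTACGTAC", "ACGTACGTAA", grid))
```

With one difference in ten sites, the most likely distance is slightly above 0.1 substitutions per site, because the model allows for the chance that some sites changed more than once.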
  209. Maximum likelihood disadvantages
    • -Slow…takes much longer to produce and requires
    • strong computing capabilities.
  210. Heuristic approaches
    • Stepwise addition
    • star decomposition
    • branch swapping
  211. Bayesian inference
    • uses your dataset to produce sets
    • of trees with greatest likelihood (based on data
    • and evolutionary model)….“looked at
    • the data and here are several trees that could
    • fit your data.”
    • based on notion of POSTERIOR PROBABILITIES
    • (coin toss analogy)
  212. Posterior probabilities
    • probabilities that are
    • estimated based on a model (prior
    • experiences), after learning something
    • about the data.
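The coin toss analogy can be made concrete with Bayes' rule: posterior is proportional to prior times likelihood. The two hypotheses (a fair coin versus a coin that lands heads 80% of the time) and the equal priors are assumptions chosen for this example.

```python
def posterior(priors, likelihoods):
    """Normalize prior x likelihood so the results sum to 1."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Observed data: three heads in a row.
priors = [0.5, 0.5]                  # fair coin, biased coin
likelihoods = [0.5 ** 3, 0.8 ** 3]   # P(data | each hypothesis)
print(posterior(priors, likelihoods))
```

Before seeing the data both hypotheses were equally probable; after three heads the biased coin is favored. In Bayesian tree inference the hypotheses are trees rather than coins, and the posterior probability of a clade plays the same role.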
  213. Bootstrapping
    • -Original alignment used to make tree
    • -Randomly generate another alignment
    • by resampling columns…can use same
    • column more than 1X…or can lose a
    • column.
    • -Make tree with new sequences. Do you
    • get same clades?
    • -Score 1 if that clade present, 0 if missing.
    • -Repeat process 100 to 1,000 times or more.
    • -Bootstrap value 90%? Pretty reliable tree.
    • Bootstrap value 25%? Tree not very reliable
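The resampling and scoring steps can be sketched as below. Tree building itself is left out; the sketch only shows how a pseudo-replicate alignment is drawn (columns sampled with replacement) and how the 1/0 scores turn into a percentage. The function names and the example alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng: random.Random):
    """Draw as many columns as the original has, with replacement."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in picks) for seq in alignment]


def bootstrap_support(clade_found_flags) -> float:
    """Percentage of replicates (scored 1 or 0) that recovered the clade."""
    return 100.0 * sum(clade_found_flags) / len(clade_found_flags)


original = ["ACGT", "ACGA", "TCGA"]
replicate = bootstrap_alignment(original, random.Random(0))
print(replicate)
print(bootstrap_support([1, 1, 1, 0]))
```

Each replicate keeps whole columns intact, so every taxon is resampled at the same sites; some columns appear more than once and others are left out.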
  214. Tree-searching methods
    • -Exhaustive search
    • -Branch-and-Bound
    • -Heuristic strategies
  215. Exhaustive search
    • -Compare all
    • possible trees
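To see why comparing all possible trees quickly becomes impractical, it helps to count them. For n taxa the number of distinct unrooted bifurcating trees is the product 3 × 5 × ... × (2n - 5). The function name below is made up for this example.

```python
def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted bifurcating trees for n_taxa labeled taxa."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):  # 3, 5, ..., 2n - 5
        count *= k
    return count


for n in (4, 5, 10, 12):
    print(n, count_unrooted_trees(n))
```

The count grows from 3 trees for 4 taxa to over 650 million for 12 taxa, which is why heuristic searches are used for larger datasets.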
  216. Branch-and-bound
    • -Make a random tree
    • for comparison.
    • -Start with tree A.
    • -Consider all trees at
    • level B and choose
    • the lowest scoring.
    • -Repeat at next level...
    • try to find tree that
    • has lower score than
    • the one you’re
    • comparing to.
    • -Start over with new
    • tree and try to find
    • tree with lower score.
  217. Stepwise addition method
    • -Start with tree A.
    • -Consider all trees at
    • level B and choose
    • the lowest scoring.
    • -Repeat at next level
  218. Star decomposition method
    • -Start with trees joined
    • at 1 internal node.
    • -Join 2 terminal nodes
    • and keep best scoring
    • tree.
    • -Repeat at next level.
  219. adaptive radiation
    process in which organisms diversify rapidly into a multitude of new forms, particularly when a change in the environment makes new resources available, creates new challenges, or opens new environmental niches.
  220. Distance methods
    • -A phenetic or clustering approach
    • -Use direct comparison of
    • sequences to assess similarity.
    • Examples:
    • -UPGMA (Unweighted Pair-Group
    • Method with Arithmetic Mean)
    • -Neighbor Joining
  221. Branch swapping method
    • -Take tree of suspected minimal length
    • -“Prune” a branch and experimentally add
    • it to the internodes in the remaining tree.
    • -Can you generate a lower score?
    • -Takes less time than searching all trees
