Genomics Quiz 6

The flashcards below were created by user johnpc on FreezingBlue Flashcards.

  1. Why perform multiple sequence alignments?
    • -To detect mutations, insertions, or deletions in similar sequences
    • -To find domains
    • -To identify secondary structure (alpha helices, beta pleated sheets, etc.)
    • -Useful to display in a publication
    • -To examine evolutionary relationships or establish homology
    • (Multiple sequence alignment → Phylogenetic analysis)
  2. Examples of programs:
    • Clustal Omega
    • T-Coffee
    • MUSCLE
  3. Phylogenetic analysis for…
    • -Gene family identification
    • -Inferring gene functions
    • -Finding origins of a genetic disorder
    • -Determining evolutionary relationships
  4. Phenetics analysis
    • -Classifies based on similarity alone
    • -Does not consider evolutionary relationships of species
  5. Disadvantages of phenetics
    • -May miscategorize organisms based on similarities due to convergent evolution.
    • -May give misleading results in closely related species that acquired phenotypic differences rapidly (adaptive radiation).
    • -In cases where many plesiomorphic (ancestral) traits are retained in different groups, the method may place members in the same clade (tendency towards monophyletic categorization).
  6. Advantages of phenetics
    • -Useful if you want to detect the distinctness of a taxon (often useful for species-level studies or detecting horizontal gene transfer).
    • -Computationally cheap
  7. Cladistics
    • -A method to classify organisms into clades (a clade is a group containing an ancestor and all of its descendants)
    • -Focuses on synapomorphies (shared derived traits)
    • -A more robust method for determining phylogenetic relationships
    • -Uses cladograms to show relationships between species
  8. Plesiomorphic:
    • Traits inherited from an ancestor (Ex: canine teeth).
    • Uninformative, since all members have the trait.
  9. Apomorphic
    • Newly evolved trait (Ex: flat face in Persians).
  10. Synapomorphic:
    • New traits shared by members of the same clade and their most recent common ancestor (Ex: dark tips in Siamese, Burmese, Singapura, etc.)
  11. Cladogram
    • -A branching diagram obtained using cladistic analysis that represents the hypothesized relationships among taxa.
    • -Branch lengths are uninformative.
    • -Measure “characters” and find the tree with the fewest transitions between different states.
  12. Phylogram
    • Branch lengths reflect the number of character changes that occurred during evolution.
    • Example: In a phylogram generated using DNA, a branch length of 4 units means that an average of 4 substitutions occurred at each nucleotide site.
  13. Chronogram
    Branch lengths reflect evolutionary time required for changes.
  14. What sorts of information can be used to generate phylogenies?
    • -Morphological traits
    • -Protein or gene sequences (coding seqs)
    • -Non-coding DNA/RNA
  15. Mitochondrial DNA
    • -Useful to track ancestry through the female lineage.
    • -Faster mutation rate than nuclear DNA (in humans) makes it easier to detect recent evolutionary events.
    • -“Molecular clock”
    • -There is evidence that maternal mitochondrial DNA may mix with paternal mitochondrial DNA.
  16. Ribosomal RNA
    • -Often chosen because ribosomal RNA genes (rDNA) contain both highly conserved (18S) and highly variable (NTS, non-transcribed spacer) sequences.
    • -Therefore they can be used either to look at very distant groups or to compare closely related species.
  17. Why generate phylogenies for genomics?
    • -To classify a gene into the appropriate subfamily (gene identification)
    • -To test predictions about relatedness and origins of genes in a multigene family
    • -To understand how organisms are related
  18. Unrooted trees
    • -Specify relationships between members of the tree, but do not indicate the evolutionary path.
    • -Do not assume that members share a common ancestor.
  19. Midpoint rooting:
    • Place the root at the midpoint of the longest path between the two most distantly related taxa. Assumes the rate of evolution is the same along all branches; often this is not a reliable assumption.
  20. How to place the root
    • Midpoint rooting
    • Rooting with an outgroup
  21. Rooting with an outgroup
    • Choose a taxon that is more distantly related to all ingroup taxa than any of the ingroup taxa are related to each other.
    • Needs to be DISTANTLY related or it may have the same problem as midpoint rooting (unequal evolutionary rates).
    • The outgroup needs to be homologous.
  22. Two approaches to tree construction
    • Algorithmic: Uses an algorithm to generate a tree from the data.
    • Tree-Searching: Program constructs many trees and compares them.
  23. Algorithmic: Uses an algorithm to generate a tree from the data.
    • -Fast
    • -Produces only one tree per dataset.
  24. Methods using Algorithmic tree construction
    • Distance methods
    • -A phenetic or clustering approach
    • -Use direct comparison of sequences to assess similarity.
    • Examples:
    • -UPGMA (Unweighted Pair-Group Method with Arithmetic Mean)
    • -Neighbor Joining
  25. UPGMA
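The card above has no body in this set. As a minimal sketch of how UPGMA is usually described, the code below repeatedly merges the closest pair of clusters and averages distances weighted by cluster size. The function name `upgma`, the taxon labels, and the distance values are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def upgma(names, dist):
    """Merge the closest pair of clusters until one tree remains.

    names: list of taxon labels
    dist:  dict mapping frozenset({a, b}) -> pairwise distance
    Returns (nested-tuple tree, list of merge heights).
    """
    clusters = {name: 1 for name in names}   # cluster -> number of leaves
    d = dict(dist)
    heights = []
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        heights.append(d[frozenset((a, b))] / 2)   # node height = half the distance
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        # Distance to every other cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = na + nb
    (tree,) = clusters
    return tree, heights

# Hypothetical distances for four taxa (illustrative values only).
raw = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
       ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
dist = {frozenset(k): v for k, v in raw.items()}
tree, heights = upgma(["A", "B", "C", "D"], dist)
print(tree, heights)
```

Because UPGMA places every tip at the same height, it implicitly assumes a constant rate of change along all branches.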
  26. Neighbor:
    • A pair of taxa connected through a single interior node X in an unrooted, bifurcating tree.
  27. Producing a Neighbor-Joining tree
    • Step 1 - Taxa clustered in a star-like tree.
    • Step 2 - Pairwise comparisons made to identify the two taxa with the shortest sum of branch lengths (most similar).
    • Step 3 - Selected taxa connected to the others via internal branch XY.
    • Step 4 - The joined pair (H/F in the example) is now treated as 1 taxon. Search again for the most similar pair; it could be 2 different taxa, or 1 taxon may be most similar to H/F. Repeat until N-3 interior branches are found.
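One joining step can be sketched as follows. The cards do not give a formula for choosing the pair, so this sketch assumes the standard Saitou and Nei criterion, Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r is a taxon's total distance to all others. The function name `nj_first_join` and the distance values are hypothetical.

```python
from itertools import combinations

def nj_first_join(names, dist):
    """Perform one neighbor-joining step on a distance matrix.

    dist maps frozenset({a, b}) -> distance.
    Returns the joined pair, their branch lengths to the new node,
    and the distances from the new node to the remaining taxa.
    """
    n = len(names)
    # r[i]: total distance from taxon i to every other taxon.
    r = {i: sum(dist[frozenset((i, k))] for k in names if k != i) for i in names}

    def q(pair):
        i, j = pair
        return (n - 2) * dist[frozenset(pair)] - r[i] - r[j]

    i, j = min(combinations(names, 2), key=q)      # pair with the lowest Q
    dij = dist[frozenset((i, j))]
    len_i = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
    len_j = dij - len_i
    # The joined pair is now treated as one taxon.
    new_d = {k: (dist[frozenset((i, k))] + dist[frozenset((j, k))] - dij) / 2
             for k in names if k not in (i, j)}
    return (i, j), (len_i, len_j), new_d

# Hypothetical distances for five taxa (illustrative values only).
raw = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {frozenset(k): v for k, v in raw.items()}
names = ["a", "b", "c", "d", "e"]
pair, lengths, new_d = nj_first_join(names, dist)
print(pair, lengths, new_d)
```

Note that the pair chosen here (a, b) is not the pair with the smallest raw distance (d, e); the Q criterion corrects for taxa that are far from everything else.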
  28. Producing a Neighbor-Joining tree
    • -Produces an unrooted tree unless outgroup specified.
    • -Final tree may not be the one with the shortest overall branch lengths.
    • -Good approach to use if studying a large group of taxa.
  29. Tree-Searching
    • Program constructs many trees and compares them.
  30. Tree-Searching construction advantage
    • Produces subset of trees consistent with dataset.
    • -Allows you to evaluate the alternative possibilities.
  31. Methods using tree searching tree construction
    • -Parsimony
    • -Maximum likelihood
    • -Bayesian analysis
  32. Parsimony assumption
    • 1- Most likely tree has the fewest changes in sequence.
    • 2- Taxa with common characteristics share them because they inherited these characteristics from the same common ancestor.
  33. Homoplasies
    • Similarities that require “extra” steps or hypotheses to explain the data.
  34. Why parsimony assumptions are sometimes not true
    • -Reversal (a character changed but then reverted back)
    • -Convergence (traits evolved independently in two unrelated taxa)
    • -Parallelism (different taxa have similar properties that predispose a characteristic to develop a certain way; ex: similar early development limits which later stages are possible)
  35. Parsimony
    • Select trees that minimize homoplasy (similarities that require “extra” steps or hypotheses to explain the data).
  36. Example of a method using parsimony:
    • Maximum Parsimony
  37. Generating a tree using maximum parsimony
    • Step 1 - Identify informative sites
    • -Need at least 2 character states, each present in at least 2 taxa.
    • Step 2 - Construct trees
    • -If fewer than 12 taxa, look at all trees.
    • -With more than 12 taxa, use a heuristic search to ignore options unlikely to produce the shortest tree.
    • Step 3 - Count the number of changes and select the tree with the shortest branch lengths (fewest changes).
    • -This is the most parsimonious tree!
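The counting in Step 3 can be sketched with Fitch's small-parsimony algorithm. The cards do not name a counting method, so Fitch's algorithm is an assumption here, used as one common way to count the minimum number of changes on a given tree. The function names, taxon labels, and sequences are hypothetical.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one site on a binary tree.

    tree:   nested 2-tuples of leaf names, e.g. (("W", "X"), ("Y", "Z"))
    states: dict mapping leaf name -> character at this site
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left, right = visit(node[0]), visit(node[1])
        if left & right:                   # children agree on some state
            return left & right
        changes += 1                       # otherwise one change is needed
        return left | right

    visit(tree)
    return changes

def tree_length(tree, alignment):
    """Total number of changes over all sites of an alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_changes(tree, {taxon: seq[i] for taxon, seq in alignment.items()})
        for i in range(n_sites)
    )

# Hypothetical two-site alignment and the three possible four-taxon topologies.
alignment = {"W": "AC", "X": "AC", "Y": "GC", "Z": "GT"}
candidates = [
    (("W", "X"), ("Y", "Z")),
    (("W", "Y"), ("X", "Z")),
    (("W", "Z"), ("X", "Y")),
]
lengths = [tree_length(t, alignment) for t in candidates]
best = candidates[lengths.index(min(lengths))]
print(lengths, best)
```

With only four taxa all three topologies can be scored directly; for larger datasets the candidate list would come from a heuristic search instead.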
  38. Uninformative nucleotides in maximum parsimony
    • 1- invariant
    • 2- unique (only 1 is different)
    • 3- all differ
    • 4- in analysis, always show 2 changes…can’t tell whether ancestral or derived
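A short sketch of how informative columns could be separated from the uninformative ones listed above. It uses the usual rule that a site is parsimony-informative when at least two character states each occur in at least two taxa. The function name and the example sequences are hypothetical.

```python
from collections import Counter

def informative_sites(sequences):
    """Return 0-based indices of parsimony-informative alignment columns."""
    result = []
    for index, column in enumerate(zip(*sequences)):
        counts = Counter(column)
        # States that are shared by two or more taxa.
        shared_states = [s for s, c in counts.items() if c >= 2]
        if len(shared_states) >= 2:
            result.append(index)
    return result

# Hypothetical aligned sequences, one string per taxon.
sequences = [
    "AAGTA",
    "AAGCA",
    "AGAGA",
    "ATAAG",
]
sites = informative_sites(sequences)
print(sites)
```

In this toy alignment column 0 is invariant, column 3 differs in every taxon, and column 4 has a single unique difference, so only column 2 is kept.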
  39. Tree-Searching Methods:
    • -Parsimony
    • -Maximum likelihood
    • -Bayesian analysis
  40. Maximum likelihood
    • -A statistical approach that uses a model of evolution.
    • -Always produces 1 tree.
    • -Allows you to decide which evolutionary model to use.
    • -Lets you know the likelihood of generating that tree.
    • -If rates of evolution vary in branches it is not a problem.
  41. Problems with parsimony
    Long-branch attraction
  42. Long-branch attraction
    Parsimony can misinterpret data because of differences in evolutionary rate: lineages that evolved fast (long branches) may be grouped together incorrectly.
  43. How likelihood generates tree
    • -Choose a model of evolution (models of nucleotide or aa substitution).
    • -Assigns probabilities to mutational events.
    • -Assigns branch lengths based on probabilities that a mutation will occur.
    • -Tree with highest likelihood assumed most likely to occur.
  44. Maximum likelihood disadvantages
    • -Slow…takes much longer to produce and requires strong computing capabilities.
  45. Bayesian inference
    • Uses your dataset to produce sets of trees with the greatest likelihood (based on data and evolutionary model): “looked at the data and here are several trees that could fit your data.”
    • Based on the notion of POSTERIOR PROBABILITIES
  46. Posterior probabilities
    • Probabilities that are estimated based on a model (prior experiences), after learning something about the data.
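As a minimal illustration of the idea, the sketch below applies Bayes' rule to three candidate trees: each posterior is the prior times the likelihood of the data, divided by the sum over all candidates. The tree labels, priors, and likelihood values are hypothetical, and real Bayesian programs estimate these quantities by sampling rather than by enumerating trees.

```python
def posterior_probabilities(priors, likelihoods):
    """Apply Bayes' rule: posterior = prior * likelihood / total."""
    joint = {tree: priors[tree] * likelihoods[tree] for tree in priors}
    total = sum(joint.values())
    return {tree: value / total for tree, value in joint.items()}

# Hypothetical values: equal prior belief in each tree,
# and made-up likelihoods of the data given each tree.
priors = {"tree1": 1 / 3, "tree2": 1 / 3, "tree3": 1 / 3}
likelihoods = {"tree1": 0.006, "tree2": 0.003, "tree3": 0.001}
posteriors = posterior_probabilities(priors, likelihoods)
for tree, p in posteriors.items():
    print(tree, round(p, 3))
```

With equal priors the posteriors are simply the likelihoods rescaled to sum to 1, which is why several trees can remain plausible at once.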
  47. Bootstrapping
    • -Original sequence alignment used to make tree
    • -Randomly generate another alignment by resampling columns with replacement: the same column can be used more than once, and a column can be left out.
    • -Make tree with new sequences. Do you get the same clades?
    • -Score 1 if that clade is present, 0 if missing.
    • -Repeat process 100 to 1,000 times or more.
    • -Bootstrap value 90%? Pretty reliable tree.
    • -Bootstrap value 25%? Tree not very reliable.
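The procedure above can be sketched as follows. To keep the example short, tree building is replaced by a toy stand-in, `closest_pair`, which only reports the two taxa with the fewest mismatches; a real analysis would build a full tree for each replicate. All function names and sequences are hypothetical.

```python
import random
from itertools import combinations

def resample_columns(alignment, rng):
    """Build one pseudo-replicate by drawing columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in picks)
            for taxon, seq in alignment.items()}

def closest_pair(alignment):
    """Toy stand-in for tree building: the pair with the fewest mismatches."""
    def mismatches(pair):
        a, b = pair
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    return frozenset(min(combinations(sorted(alignment), 2), key=mismatches))

def bootstrap_support(alignment, clade, replicates=1000, seed=0):
    """Percentage of replicates in which the given clade is recovered."""
    rng = random.Random(seed)
    hits = sum(
        closest_pair(resample_columns(alignment, rng)) == clade
        for _ in range(replicates)
    )
    return 100 * hits / replicates

# Hypothetical aligned sequences (illustrative only).
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACTTGC",
    "D": "TCGAACTTGG",
}
support = bootstrap_support(alignment, frozenset({"A", "B"}))
print(f"Support for clade A+B: {support:.1f}%")
```

Fixing the seed makes the run repeatable; the support value itself depends on the data and on the number of replicates.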
  48. Tree-searching methods
    • -Exhaustive search
    • -Branch-and-Bound
    • -Heuristic strategies
  49. Exhaustive search
    • -Compare all possible trees
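A short calculation shows why comparing all trees is only practical for small datasets. For n taxa, the number of distinct unrooted bifurcating trees is (2n-5)!!, the product of the odd numbers from 1 to 2n-5. The function name below is hypothetical.

```python
def unrooted_tree_count(n_taxa):
    """Number of distinct unrooted bifurcating trees for n_taxa >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

for n in (4, 5, 10, 12):
    print(n, unrooted_tree_count(n))
```

The count grows from 3 trees for 4 taxa to over 2 million for 10 taxa and over 650 million for 12, which is consistent with switching to heuristic searches at around a dozen taxa.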
  50. Branch-and-bound
    • -Make a random tree for comparison.
    • -Start with tree A.
    • -Consider all trees at level B and choose the lowest scoring.
    • -Repeat at next level... try to find a tree that has a lower score than the one you’re comparing to.
    • -Start over with a new tree and try to find a tree with a lower score.
  51. Stepwise addition method
    • -Start with tree A.
    • -Consider all trees at level B and choose the lowest scoring.
    • -Repeat at next level.
  52. Star decomposition method
    • -Start with trees joined at 1 internal node.
    • -Join 2 terminal nodes and keep the best scoring tree.
    • -Repeat at next level.
  53. Branch swapping method
    • -Take a tree of suspected minimal length.
    • -“Prune” a branch and experimentally add it to the internodes in the remaining tree.
    • -Can you generate a lower score?
    • -Takes less time than searching all trees.
Card Set: Genomics Quiz 6 (2015-11-13 16:33:20)