Genomics Quiz 5
To determine how many genes were necessary for
Any manipulation that reduces expression of a specific gene.
Gene knockdown approaches
-Overexpress a dominant negative
-Inject a morpholino
Overexpress a dominant negative Example
Tyrosine kinase receptor
Inject mRNA that will be translated into a mutant subunit protein
that lacks the site needed to initiate signaling
Ligand binds to dimer, but no signaling activated
•Like RNA but ribose replaced with morpholine group;
phosphodiester backbone modified.
•Resists degradation by cell.
•Complementary to RNA sequences.
Translation blocking morpholinos
Binds near the initiation site (overlapping portions of the 5’UTR
and the first exon), interfering with ribosome assembly
and preventing translation.
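A minimal sketch of how a translation-blocking target could be picked: take a window of mRNA spanning the start codon and write out the antisense sequence that would pair with it. The 25-base length, the 10-base upstream offset, the function name, and the example mRNA are all assumptions for illustration and are not taken from the notes.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}

def design_blocking_oligo(mrna, start_index, upstream=10, length=25):
    """Return an antisense oligo (DNA letters, 5'->3') covering the start codon.

    mrna: mRNA sequence written 5'->3' in RNA letters.
    start_index: 0-based position of the A in the AUG start codon.
    upstream, length: hypothetical window settings.
    """
    begin = max(0, start_index - upstream)
    target = mrna[begin:begin + length]
    # Antisense strand = complement of the target, read in reverse.
    return "".join(COMPLEMENT[base] for base in reversed(target))

# Made-up mRNA: 12 nt of 5' UTR followed by the start of the coding region.
mrna = "GGCACGAGCUCCAUGGCUUCAGGAACCUUGA"
oligo = design_blocking_oligo(mrna, start_index=mrna.index("AUG"))
print(oligo)
```

The oligo contains CAT, the reverse complement of the AUG start codon, so it overlaps the site where the ribosome would begin translation.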
Homologous recombination (knockout)
Relies on recombination between a vector carrying positive and negative
Recombination occurs between identical sequences in vector and chromosomal DNA.
Often organisms appear not to “need” many of their genes
-If you knock down or knock out any single gene in a mouse, most of
the time the mouse will survive.
-Only about 30% of the genes are necessary for survival!
-In yeast, more than 4,700 single-gene knockouts were performed in
homozygous diploid lines. Only 10.7% exhibited reduced growth/
viability. Growth of 83.5% of the knockouts was unaffected
Redundancy refers to pairs of homologous genes
with functional overlap where one can compensate
for loss of the other.
Sources of genetic redundancy?
Implications of high occurrence of redundancy in signaling components?
•Functional overlap in redundant genes may be beneficial in
maintaining ability to signal.
Examples of redundant signaling proteins?
-Myogenic regulators (MyoD, Myf5, myogenin, Mrf4)
-In a study of 59 pairs of redundant genes (yeast), the
redundant forms were not expressed at the same
time and/or place. Expression patterns didn’t overlap.
In vertebrate developmental pathways, redundant
expressed in spatially or temporally
Knocking out one gene often results in
upregulation of the
Redundancy may allow
compensation when one isoform lost
-Early in vertebrate development, segmented blocks of tissue appear on either side of the nerve cord
-The somites form muscle, tendons, endothelial cells, dermis, and cartilage.
Portions of the somite forming
adult skeletal muscle divided
into two domains…
Epaxial: dorso-medial region (ep)
Hypaxial: ventrolateral region (hyp)
Redundancy may allow adaptation to
Redundancy can be used to increase ability of cell
to sense changes in
environment and respond.
Redundancy may improve
processing of external info
Gene redundancy may provide important opportunities
for the organism…
-Compensating for loss of another molecule
-Adapting to changing external factors
-Improved processing of external information
-How are new gene families created?
A) Divergent evolution
B) Concerted evolution
C) Birth-and-death evolution
A group of genes is duplicated.
-Evolution of gene A doesn’t
affect evolution of gene B.
-Each gene gradually diverges
as mutations accumulate.
-Duplicate genes assume new
Example: alpha and beta globins
-Gene family members do not evolve
independently. Evolve at same time.
-In frog ribosomal RNA genes, the
intergenic regions are more similar among
the different rRNA genes of one species than
between two related Xenopus species.
-Why? If one gene acquires a mutation,
it spreads to the other rRNAs by
unequal crossing over or by
Examples: 5S rRNA in Xenopus,
primate U2 snRNA
-New genes made by gene duplication.
-Some remain, others become
pseudogenes or are deleted.
-Pattern in phylogenetic tree is more
difficult to interpret.
Example: Major histocompatibility complex genes
Instead of focusing on genes, look at evolution
of protein domains.
All the domains from a dataset make up the “domainome” for that
group of sequences.
How can the G-value paradox be explained?
-More potential combinations between proteins
-Roles of non-coding RNA
assumes that once a complex trait is lost in a lineage, it cannot be
Domains acting in cell regulation
Domains related to metabolism
Suggests that roles of metabolic domains may be taken over by symbionts
High number of domains in basal groups supports idea that
common ancestor of eukaryotes was complex.
Class I elements
Class II elements
Insertion of transposable elements can harm the host by
Insertional mutagenesis, chimeric transcript production, antisense effects, and illegitimate recombination
How does the genome defend itself from transposable elements?
1- In cytoplasm, Dicer binds
2- Dicer cleaves dsRNA to
form small interfering RNAs
(siRNAs, 21-23 bp long w/ 2-nt
overhang at each 3’ end).
3- The RNA-induced silencing
complex (RISC) forms when the
antisense strand of siRNA
associates with Argonaute 2
(AGO2) protein. May also include
other protein types.
4- RISC complex scans RNAs
to find a sequence
complementary to the siRNA.
The antisense (guide) strand
base-pairs with the target
mRNA and RISC cleaves the target.
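The scanning step can be sketched as a search for a stretch of mRNA that is perfectly complementary to the guide strand. This is a simplified illustration: the sequences and function names are made up, and real RISC targeting involves more than an exact string match.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(PAIR[base] for base in reversed(rna))

def find_cleavage_targets(guide, mrna):
    """Return 0-based start positions in mrna that pair perfectly with guide."""
    site = reverse_complement(guide)
    return [i for i in range(len(mrna) - len(site) + 1)
            if mrna[i:i + len(site)] == site]

guide = "UUCGAAGUACUCAGCGUAAGU"  # made-up 21-nt guide (antisense) strand
mrna = "AAGG" + reverse_complement(guide) + "CCUU"  # made-up target mRNA
print(find_cleavage_targets(guide, mrna))  # [4]
```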
1- MicroRNA is transcribed from DNA.
2- These short RNA sequences form
hairpin loops and are transported to
the cytoplasm by exportin 5.
3- In the cytoplasm, Dicer trims the
dsRNA (22 bp seq with overhang).
4- miRNAs silence gene expression by
binding complementary regions in
the 3’ UTR of target mRNAs
(animals) or by binding coding
regions of target mRNAs (plants).
Base pairing is often partial and binding of the miRNA affects multiple mRNA types.
Effects? Inhibiting translation, causing loss of poly-A tail, interfering with methylated cap/
poly-A tail interactions, or causing mRNA degradation by exonucleases.
-Bacteria use DNA methylation to protect the genome from
degradation by restriction enzymes.
-Restriction enzymes cannot destroy methylated restriction sites
but can cleave unmethylated restriction sites.
-Endogenous methyltransferases attach methyl groups to
cytosines or adenines.
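A small sketch of the protection idea: a restriction site is reported as cleavable only if none of its bases is methylated. EcoRI's GAATTC recognition site is used as the example; the DNA sequence, the function name, and the rule that any methylated base protects the whole site are simplifying assumptions.

```python
def cut_sites(dna, site, methylated_positions=frozenset()):
    """Return start indices of recognition sites that would be cleaved.

    A site is treated as protected if any base inside it is methylated
    (a simplification for illustration).
    """
    hits = []
    start = dna.find(site)
    while start != -1:
        span = range(start, start + len(site))
        if not any(pos in methylated_positions for pos in span):
            hits.append(start)
        start = dna.find(site, start + 1)
    return hits

dna = "TTGAATTCAAGGGAATTCTT"  # made-up sequence with two GAATTC sites
print(cut_sites(dna, "GAATTC"))                            # [2, 12]
print(cut_sites(dna, "GAATTC", methylated_positions={4}))  # [12]
```

Methylating the adenine at index 4 protects the first site, so only the unmethylated site at index 12 is cut.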
-In eukaryotes, modified 5-methyl-
cytosine is made by adding a methyl
group to the 5 position of cytosine.
-Causes transcriptional silencing when
promoter region methylated. Good for
long-term silencing. Doesn’t appear to
-Repetitive DNA in plants and
mammals is usually methylated.
-A method to block retroviral replication.
-In primates, the host protein APOBEC3G is
packaged into virions and acts on the
1st strand cDNA during reverse transcription.
-It deaminates cytosines in the retrovirus
cDNA strand, converting them to uracils.
-During second strand synthesis, uracil is
recognized as thymidine, so adenine is
inserted in the new DNA strand.
-Deactivates virus by mutating up to 25%
of cDNA guanine residues.
-Not as effective against the HIV-1 retrovirus. This
virus makes Vif (viral infectivity factor),
which inhibits activity of the APOBEC3G protein.
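The editing steps above can be traced on a toy sequence: deaminate cytosines in the first-strand cDNA, then copy that strand and compare the result with the unedited copy. The sequence, the chosen positions, and the function names are made up for illustration.

```python
def deaminate_minus_strand(minus_cdna, positions):
    """Convert C to U at the chosen positions of the first-strand cDNA."""
    bases = list(minus_cdna)
    for pos in positions:
        if bases[pos] == "C":
            bases[pos] = "U"
    return "".join(bases)

def synthesize_plus_strand(minus_cdna):
    """Copy the first strand; a U in the template pairs with A, as T would."""
    pair = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}
    return "".join(pair[base] for base in reversed(minus_cdna))

minus = "TCCAGCTTC"  # made-up first-strand cDNA
normal_plus = synthesize_plus_strand(minus)
edited = deaminate_minus_strand(minus, positions=[1, 2, 5])
mutant_plus = synthesize_plus_strand(edited)

print(normal_plus)  # GAAGCTGGA
print(mutant_plus)  # GAAACTAAA
```

Every difference between the two second strands is a G replaced by an A, which is how deaminating cytosines on one strand shows up as lost guanines on the other.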
WHY Protein alignments are more useful if you’re comparing distantly related sequences
-Peptide sequences are more likely to be conserved than nucleotide
sequences since there are multiple codons for the same amino acid.
- Some amino acids have similar biophysical properties. Similarities can
be accounted for in the protein scoring matrices.
-Similarities arising early in the evolutionary process may be detectable
using a protein alignment, but not a DNA alignment.
WHY Nucleotide alignments can be more useful if you’re comparing closely related sequences
-If sequences are very closely related, the amino acid sequences may
not vary much. Nucleotide sequence will vary more, allowing easier
detection of differences.
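A short illustration of both points: two coding sequences can differ at many nucleotide positions and still translate to the same peptide, because several codons encode the same amino acid. The codon table below is only the subset of the standard genetic code needed for this made-up example.

```python
# Subset of the standard genetic code (only the codons used below).
CODONS = {
    "GCT": "A", "GCC": "A",
    "CTG": "L", "TTA": "L",
    "AAA": "K", "AAG": "K",
    "GGT": "G", "GGC": "G",
    "TCT": "S", "AGC": "S",
}

def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))

def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

seq1 = "GCTCTGAAAGGTTCT"  # made-up coding sequences
seq2 = "GCCTTAAAGGGCAGC"

print(translate(seq1), translate(seq2))  # ALKGS ALKGS
print(identity(seq1, seq2))              # 7 of 15 nucleotides match
print(identity(translate(seq1), translate(seq2)))
```

The peptides are identical while fewer than half of the nucleotides match, so the DNA comparison shows differences that the protein comparison hides.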
Global Sequence Alignment
-Tries to align two sequences along entire length.
-Best for highly similar sequences of same length.
-As similarities decrease, misses important relationships.
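A minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch approach, named here as the standard technique rather than something the notes specify). The scoring values of +1 match, -1 mismatch, and -1 gap are assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best score for aligning a and b along their entire length."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalized because the whole length must be aligned.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]

print(global_alignment_score("GATTACA", "GATTACA"))  # 7
print(global_alignment_score("ACGT", "AGT"))         # 2
```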
Local Sequence Alignment
-Looks for the most similar regions in sequence instead
of trying to align entire length.
-May return more than 1 result if there is more than
1 subsequence in common.
-Good method to use if sequences differ in length or
share partial similarity.
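A minimal sketch of local alignment scoring (the Smith-Waterman approach, named here as the standard technique rather than something the notes specify). Cell scores are never allowed to fall below zero, so the best-scoring shared region is found regardless of what flanks it. The scoring values are assumptions.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the score of the best-scoring local alignment between a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Resetting to 0 lets an alignment start anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best

# Two made-up sequences of different length sharing only GATTACA.
print(local_alignment_score("TTTTGATTACATTTT", "CCCGATTACACCC"))  # 14
```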
-A simple way to visually compare 2 sequences to
find local alignments, direct repeats, inverted
repeats, insertions, deletions or low-complexity regions
Why aren’t dot plots used more?
Do not provide a measure of statistical similarity
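A dot plot is simple enough to sketch in a few lines: mark every position where a short window of one sequence matches a window of the other. The window size and the example sequence are assumptions. The output is purely visual, with no score or significance attached.

```python
def dot_plot(seq_x, seq_y, window=3):
    """Return text rows; '*' marks exact window matches, '.' everything else."""
    rows = []
    for j in range(len(seq_y) - window + 1):
        row = ""
        for i in range(len(seq_x) - window + 1):
            row += "*" if seq_x[i:i + window] == seq_y[j:j + window] else "."
        rows.append(row)
    return rows

# Comparing a made-up sequence with itself: the main diagonal is the
# self-match, and the off-diagonal marks reveal the direct repeat of ACG.
for row in dot_plot("ACGACG", "ACGACG"):
    print(row)
# *..*
# .*..
# ..*.
# *..*
```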
-Score is the log of an odds ratio. Considers how often, in nature,
a particular residue is substituted for another versus how often
this substitution would occur by random chance.
$S_{i,j} = \log\left[\frac{q_{i,j}}{p_i\,p_j}\right]$
$S_{i,j}$ is the score for a replacement of residue i with residue j.
$q_{i,j}$ represents how often the two amino acids align with each
other in multiple sequence alignments of protein groups.
$p_i$ is the probability that residue i will occur among all proteins
$p_j$ is the probability that residue j will occur among all proteins
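A worked example of the log-odds score with made-up frequencies. The logarithm base (2 here) and all numbers are assumptions; the notes give only the general form of the formula.

```python
import math

def log_odds_score(q_ij, p_i, p_j, base=2):
    """Log of observed pair frequency over the frequency expected by chance."""
    return math.log(q_ij / (p_i * p_j), base)

# Hypothetical background frequencies of 0.1 for both residues, so a chance
# pairing is expected with frequency 0.1 * 0.1 = 0.01.
favored = log_odds_score(q_ij=0.02, p_i=0.1, p_j=0.1)      # seen twice as often as chance
disfavored = log_odds_score(q_ij=0.005, p_i=0.1, p_j=0.1)  # seen half as often as chance
print(round(favored, 3), round(disfavored, 3))
```

A substitution seen more often than chance gets a positive score (about +1 here) and one seen less often gets a negative score (about -1).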
Relies on “seeding”. Looks for a short query word. Finds this and
-To score related words, uses a scoring matrix. The set of words scoring above the threshold is called “the neighborhood.”
-Threshold setting controls how many options allowed in the neighborhood.
-Then performs local alignment. Extends until gaps and mismatches decrease
score below the score threshold (S)…This info recorded by BLAST.
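A sketch of how a neighborhood of related words could be built and how the threshold controls its size. The scoring function below is a toy stand-in (identical residues score 4, residues in the same hypothetical similarity group score 2, everything else -1) and is not a real substitution matrix; the reduced alphabet is also an assumption.

```python
from itertools import product

SIMILAR_GROUPS = ["ILV", "DE", "KR"]  # toy groupings, not a real matrix
ALPHABET = "DEIKLRV"                  # reduced alphabet for the example

def toy_score(a, b):
    """Toy residue-pair score: 4 identical, 2 same group, -1 otherwise."""
    if a == b:
        return 4
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return 2
    return -1

def neighborhood(word, threshold):
    """Return all words whose score against `word` is at least `threshold`."""
    hits = []
    for candidate in product(ALPHABET, repeat=len(word)):
        score = sum(toy_score(x, y) for x, y in zip(word, candidate))
        if score >= threshold:
            hits.append("".join(candidate))
    return hits

print(neighborhood("LKD", threshold=12))       # ['LKD']
print(len(neighborhood("LKD", threshold=10)))  # 5
print(len(neighborhood("LKD", threshold=8)))   # 10
```

Lowering the threshold admits more related words into the neighborhood, which increases sensitivity at the cost of more seeds to extend.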
-Entering a query sequence from a low-complexity region can
inflate BLAST scores.
(Masking with DUST or SEG counters this.)
-The hit list contains entries that represent hypothetical
-Hits to ESTs should be treated with caution. Sequencing
accuracy is lower than in “finished”
How does BLAST determine length
Program measures the cumulative
score as the alignment is extended.
•If the drop-off after a peak
exceeds a threshold value (X),
extension is terminated and the
alignment is trimmed to the preceding peak in
•HSP = high-scoring segment pair
and is the trimmed alignment
•Then calculates E value to determine if alignment