Exercise session 1
Exercise 1
OMIM: identify disease-related genes
Tasks:
• Choose a gene and identify terms that characterize it. Some examples: Alzheimer, breast cancer.
• What human genes are related to hypertension?
• Retrieve the OMIM record for the cystic fibrosis transmembrane conductance regulator (CFTR), and link to related protein sequence records in Entrez.
• Find the OMIM record for the p53 tumor protein, and link out to related information in Entrez Gene and the p53 Mutation Database.
* gene with known sequence
+ gene with known sequence and phenotype
# phenotype description, molecular basis known
% mendelian phenotype or locus, molecular basis unknown
other, mainly phenotypes with
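The prefix symbols listed above can be read off an OMIM entry label programmatically. The following is a minimal sketch, not part of the exercise: the function name `classify_omim_entry` is hypothetical, and the meaning of entries without a prefix is left generic because the list above is cut off at that point.

```python
# Prefix meanings exactly as listed above.
OMIM_PREFIXES = {
    "*": "gene with known sequence",
    "+": "gene with known sequence and phenotype",
    "#": "phenotype description, molecular basis known",
    "%": "mendelian phenotype or locus, molecular basis unknown",
}


def classify_omim_entry(label):
    """Split a label such as '#219700' into (number, meaning of the prefix)."""
    label = label.strip()
    if label and label[0] in OMIM_PREFIXES:
        return label[1:].strip(), OMIM_PREFIXES[label[0]]
    # No recognised prefix: the source list is truncated for this case.
    return label, "other (no prefix)"


if __name__ == "__main__":
    for entry in ["#219700", "*602421", "%123456", "123456"]:
        print(entry, "->", classify_omim_entry(entry))
```

The labels in the usage loop are sample input only; check the actual prefix of each record on its OMIM page.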
Exercise 1
GO: identify genes and concepts
Tasks:
• Use [Gene Ontology] and try to identify one or more relevant terms to describe a specific gene.
• Use the selected term(s) to find the gene in [Entrez Gene]. Be aware of the species you select when querying Entrez Gene.
• Describe the information found in the record of your selected gene.
• See if there are additional links to other resources such as KEGG Pathways, UniGene, ...
• What kind of information is stored under these additional links?
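The species restriction mentioned in the tasks above can be written directly into the query. The sketch below only builds the query string and an NCBI E-utilities `esearch` URL; it does not contact the server. It assumes the Entrez Gene field tags `[Gene Ontology]` and `[Organism]`; the helper names `gene_query` and `esearch_url` and the example GO term are illustrative choices, not part of the exercise.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_query(go_term, organism="Homo sapiens"):
    """Combine a GO term with a species filter (field tags are assumptions)."""
    return f'"{go_term}"[Gene Ontology] AND "{organism}"[Organism]'


def esearch_url(term, retmax=20):
    """Return an esearch URL for the Entrez Gene database (db=gene)."""
    params = {"db": "gene", "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    query = gene_query("chloride channel activity")
    print(query)
    print(esearch_url(query))
```

The same query string can be pasted into the Entrez Gene search box, which makes it easy to check that the species filter behaves as expected before automating anything.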
Exercise 1
Sequence Level
Tasks:
• Select a GenBank sequence from the Entrez Gene record of your gene.
• What kind of information is stored in the GenBank record?
• Save the protein sequence of your gene in FASTA format. Use the display and send to file buttons.
• Select a set of protein sequences related to your gene.
• Save a selection of five to ten protein neighbours in one FASTA file.
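To check a saved multi-sequence file, it helps to know that a FASTA record is a header line starting with `>` followed by one or more sequence lines. The following minimal sketch reads and writes that layout with the standard library only; the function names and the toy sequences are hypothetical.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def to_fasta(records, width=60):
    """Format (header, sequence) tuples as FASTA, wrapping sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    toy = [("seq1 toy example", "MKT" * 30), ("seq2 toy example", "ACDE")]
    text = to_fasta(toy)
    print(text)
    print("records in file:", len(parse_fasta(text)))
```

Counting the parsed records is a quick way to confirm that the file really contains the five to ten neighbours you selected.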
Exercise 1
Sequence Level
Tasks:
• Locate the gene in the complete human genome. Use the Ensembl genome browser.
• Use the accession number of your gene to find its entry in Ensembl.
• Where is your gene located on the genome?
• Are any alternative transcripts or orthologues listed?
• How was this information obtained (homology, prediction, ESTs), and is there any information about its reliability?
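The location question above can also be answered outside the browser. The sketch below assumes the Ensembl REST service at `rest.ensembl.org` with its `lookup/symbol` endpoint and the response fields `seq_region_name`, `start`, `end` and `strand`; treat these as assumptions to verify against the Ensembl REST documentation. It builds the request URL and formats a location string, without making a network call. The coordinates in the usage example are placeholders, not real data.

```python
from urllib.parse import quote

# Assumed base address of the Ensembl REST service.
REST = "https://rest.ensembl.org"


def lookup_url(symbol, species="homo_sapiens"):
    """Build a lookup-by-symbol URL that asks for a JSON response."""
    return (f"{REST}/lookup/symbol/{quote(species)}/{quote(symbol)}"
            "?content-type=application/json")


def format_location(record):
    """Turn a lookup-style record into 'chromosome:start-end (strand)'."""
    strand = "+" if record["strand"] == 1 else "-"
    return (f'{record["seq_region_name"]}:'
            f'{record["start"]}-{record["end"]} ({strand})')


if __name__ == "__main__":
    print(lookup_url("CFTR"))
    # Placeholder record with made-up coordinates, for illustration only.
    placeholder = {"seq_region_name": "7", "start": 100, "end": 200, "strand": 1}
    print(format_location(placeholder))
```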
Exercise 1
Sequence Level
Tasks:
• With BLAST you can identify sequences homologous to your gene in large sequence databases.
• Do a blastn with the saved DNA sequence.
• Do a blastp with the saved protein sequence.
• How well do these results match? What are the differences, if any?
• Do the hits retrieved with blastp correspond to the neighbouring proteins you saved before?
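The comparison between blastp hits and the saved neighbours can be made explicit with a set comparison. The sketch below assumes the hits were exported as a tab-separated hit table with the twelve standard columns (as produced by BLAST+ with `-outfmt 6`); the function names, the E-value cutoff and the identifiers in the usage example are hypothetical.

```python
# The twelve standard columns of BLAST tabular output.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_blast_tab(text):
    """Parse tab-separated BLAST hits into a list of dictionaries."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, line.split("\t")))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


def compare_with_neighbours(hits, neighbour_ids, max_evalue=1e-5):
    """Report which subject IDs are shared with the saved neighbours."""
    hit_ids = {h["sseqid"] for h in hits if h["evalue"] <= max_evalue}
    neighbours = set(neighbour_ids)
    return {
        "shared": sorted(hit_ids & neighbours),
        "only_blast": sorted(hit_ids - neighbours),
        "only_neighbours": sorted(neighbours - hit_ids),
    }


if __name__ == "__main__":
    # Made-up hit table with placeholder identifiers.
    table = (
        "query1\tprotA\t98.5\t100\t1\t0\t1\t100\t1\t100\t2e-50\t200\n"
        "query1\tprotB\t40.0\t80\t48\t0\t1\t80\t5\t84\t0.5\t30\n"
    )
    print(compare_with_neighbours(parse_blast_tab(table), ["protA", "protC"]))
```

The cutoff is a judgment call: a stricter `max_evalue` keeps only strong hits, so the overlap with the neighbour list will change with it.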
Exercise 1
Sequence Level
Tasks:
• To functionally annotate a given protein, we can use several prediction tools. A large collection of tools is available at [Expasy].
• Start with the ScanProsite tool to find specific domains in your protein sequence.
• Are any relevant features found?
• Try the MotifScan and InterProScan tools.
• What kinds of features are predicted, and how do the results of the different tools compare with each other?
• How do the predictions correspond to the