There are many versions of BLAST searches. The simplest are:
- blastx: translates the nucleotide query in all six reading frames and matches the translations against a chosen protein database, such as NR.
- blastn: searches a nucleotide sequence (here, the given ORF) against a chosen nucleotide database, such as EMBL.
- blastp: searches a sequence of amino acids against a chosen protein database.
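To make the "six reading frames" idea behind blastx concrete, here is a minimal Python sketch of six-frame translation. It is not how BLAST itself is implemented; the function names are made up for illustration, and it assumes the standard genetic code and a plain A/C/G/T sequence (any other codon is translated as X).

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # X for codons with unknown bases
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate both strands at offsets 0, 1 and 2 (frames +1..+3, -1..-3)."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCTAA").items():
        print(name, protein)
```

A blastx-style search would then compare each of the six translated sequences against the protein database, which is why a DNA query can find protein hits.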
Example gene description: Fields are only filled in if appropriate information is found.
- Read and know Chapter 2. Your next assignment will involve coding a similarity search.
- I will provide everyone with the three ORFs in the file ThreeMysteryOrfs.txt in Masterhit for the course. Go to the Yeast database (http://mips.gsf.de/genre/proj/yeast/index.jsp) and try to identify each ORF. To do this, look at the first 20,000 bp of chromosome I. Hand in (that is, deposit) a description of each gene, as indicated below.
- Using basic BLAST (http://www.ncbi.nlm.nih.gov/BLAST), run blastx on each of the ORFs. Specify that your sequence is DNA and that you want to search the NR database. NR stands for non-redundant; in practice, however, it still contains redundant entries. Do not change any of the defaults. Depending on the server load, this may take half an hour. Do not wait until the last minute, when the load can be very heavy. Use the top three hits from blastx (if there are three hits) and describe the genes found as before. To get a description, follow the first hot-link provided.
- Again use basic BLAST, but this time run blastn on the three mystery ORFs. Select the EMBL database, and run blastn on each sequence separately. Use only the topmost hit and its first hot-link, and describe the genes that are found, as below.
Note: Each of questions 2, 3, and 4 requires that you describe the genes. You may get different descriptions from the different tools.
- gene name(s):
- gene product(s): These may be hypothetical; if so, say so.
- gene function(s): These may be hypothetical; if so, say so.
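If you want to keep your descriptions consistent across the three tools, a small record type is one option. The Python sketch below is only an illustration: the class name `GeneDescription`, its field names, and the placeholder values are assumptions, not part of the assignment. It leaves out any field for which nothing was found and marks hypothetical products and functions.

```python
from dataclasses import dataclass, field


@dataclass
class GeneDescription:
    """One gene description; empty fields are omitted from the output."""
    names: list = field(default_factory=list)
    products: list = field(default_factory=list)
    functions: list = field(default_factory=list)
    hypothetical_product: bool = False
    hypothetical_function: bool = False

    def format(self):
        """Return the description in the bullet layout used above."""
        lines = []
        if self.names:
            lines.append("- gene name(s): " + ", ".join(self.names))
        if self.products:
            note = " (hypothetical)" if self.hypothetical_product else ""
            lines.append("- gene product(s): " + ", ".join(self.products) + note)
        if self.functions:
            note = " (hypothetical)" if self.hypothetical_function else ""
            lines.append("- gene function(s): " + ", ".join(self.functions) + note)
        return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values only; replace them with what the tools report.
    example = GeneDescription(
        names=["EXAMPLE1"],
        products=["example protein"],
        hypothetical_product=True,
    )
    print(example.format())
```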