Function of duplicate genes
The supplementary data for Roux et al. (2017), "Selective Constraints on Coding Sequences of Nervous System Genes Are a Major Determinant of Duplicate Gene Retention in Vertebrates", are available on GitHub.
Browse this data and find the gene lists for zebrafish 3R ohnologs (fish whole genome duplication paralogs), zebrafish 3R singletons (genes returned to singleton after the genome duplication), and zebrafish SSD (small scale duplicates).
Using a tool of your choice (e.g., cut on the command line, R, Excel, Google Spreadsheets), reduce each of the three lists to its Ensembl gene identifiers only; they should be of the form ENSDARG0000xxxxxxx.
Also create a combined list containing all the Ensembl identifiers from the three lists. It will serve as the background (reference) list in the analyses below.
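If you prefer a script to a spreadsheet, the two steps above can be sketched in Python as follows. This is only a minimal sketch: it assumes that every zebrafish identifier in a file belongs in the list (it picks up every ENSDARG identifier regardless of column, so check that this matches the file you downloaded), and the function names and the sample text are made up for illustration.

```python
import re

# Zebrafish Ensembl gene identifiers: "ENSDARG" followed by 11 digits.
ENSDARG_RE = re.compile(r"\bENSDARG\d{11}\b")


def extract_gene_ids(text):
    """Return the unique ENSDARG identifiers found in text, in order of first appearance."""
    seen = {}
    for gene_id in ENSDARG_RE.findall(text):
        seen.setdefault(gene_id, None)
    return list(seen)


def union_of_lists(*id_lists):
    """Merge several identifier lists into one list without duplicates."""
    seen = {}
    for ids in id_lists:
        for gene_id in ids:
            seen.setdefault(gene_id, None)
    return list(seen)


# Made-up sample content standing in for two downloaded tables.
sample_a = "ENSDARG00000000001\tgeneA\nENSDARG00000000002\tgeneB\n"
sample_b = "ENSDARG00000000002\tgeneB\nENSDARG00000000003\tgeneC\n"

ids_a = extract_gene_ids(sample_a)
ids_b = extract_gene_ids(sample_b)
background = union_of_lists(ids_a, ids_b)

# One identifier per line, ready to paste into a web form.
print("\n".join(background))
```

To use it on real data, read each downloaded file into a string (for example with `open(path).read()`), pass it to `extract_gene_ids`, and write each resulting list to its own text file, one identifier per line.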
Gene Ontology
To perform a Gene Ontology (GO) enrichment, go to PantherDB (you can also find the link from the geneontology.org page).
Choose Panther tools, Gene list analysis.
Upload one of your gene lists (3R, singleton or SSD), choose the correct species, and choose "Statistical overrepresentation test".
Important! Uncheck "Use default settings" so that you can upload your own background file.
On the next page, change "Reference List" to your concatenated list of all genes in the three lists.
Explore results using different gene lists and different sets of GO annotations.
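To see why the background list matters, it can help to compute one overrepresentation p-value by hand. The sketch below uses a one-sided hypergeometric test, which is one common way to test for overrepresentation. It is not meant to reproduce PANTHER's own test options or its multiple-testing correction, and the function names and toy gene sets are invented for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N background
    genes, K of which carry the annotation of interest."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def overrepresentation_p(test_genes, background_genes, annotated_genes):
    """One-sided p-value for the annotation being overrepresented in test_genes
    relative to background_genes. Genes outside the background are ignored."""
    background = set(background_genes)
    test = set(test_genes) & background
    annotated = set(annotated_genes) & background
    k = len(test & annotated)
    return hypergeom_upper_tail(k, len(test), len(annotated), len(background))


# Toy example with invented gene names: 10 background genes, 5 of them annotated.
toy_background = [f"g{i}" for i in range(1, 11)]
toy_annotated = ["g1", "g2", "g3", "g4", "g5"]
toy_test = ["g1", "g2", "g3", "g4", "g6"]

print(overrepresentation_p(toy_test, toy_background, toy_annotated))
```

Changing the background changes N and K, and therefore the p-value. This is why the test list is compared against the combined list of the three gene sets rather than against the whole genome.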
TopAnat
Go to the Bgee TopAnat page. As for the GO enrichment test, paste a gene list of interest (SSD, singleton or 3R) as "Gene list". Change the background to "Custom data" and paste your concatenated list of all genes.
The analysis can take some time. Coordinate with your neighbors so that each of you runs a different gene set or a different set of options, then compare results.