Exercise 1 ArrayExpress Archive - High-throughput sequencing example


ArrayExpress and Atlas practical: querying and exporting gene expression data at the EBI

Gabriella Rustici
gabry@ebi.ac.uk

This practical will introduce you to the data content and query functionality of the ArrayExpress Archive and Atlas. We suggest using Firefox for this tutorial.

Additional information on these two resources, including dedicated courses and more exercises, can be found on the EBI e-learning portal, Train Online: http://www.ebi.ac.uk/training/online/

For the Archive see: http://www.ebi.ac.uk/training/online/course/arrayexpress-exploringfunctional-genomics-data-ar
For the Atlas see: http://www.ebi.ac.uk/training/online/course/arrayexpress-investigatinggene-expression-pattern-0

Please note that the results you obtain while doing the exercises might differ from what is illustrated here because of a recent database update.

Exercise 1 ArrayExpress Archive - High-throughput sequencing example

Scenario
High-throughput sequencing (HTS) is becoming a popular tool in cancer research to decipher the genetic make-up of a tumor. Mutations, epigenetic mis-regulation and genomic reorganisation are just some of the things that can be studied using this technology. The results obtained from these experiments will provide a new dimension in the study of cancer biology. Imagine that you have just started a project working on human prostate adenocarcinoma and you want to find all the experiments in the ArrayExpress Archive that use RNA sequencing assays to study this cancer.

Task
Use the ArrayExpress Archive (http://www.ebi.ac.uk/arrayexpress/) to find relevant experiments.

Exercise 2 Expression Atlas - Regulation of transcription

Scenario
Carcinoma of the prostate is the most frequently diagnosed neoplasm in men in industrialized countries. The androgen receptor (AR), a transcription factor that mediates the action of androgens in target tissues, is expressed in nearly all prostate cancers. During prostatic carcinogenesis, major changes in the androgen receptor pathways occur. Androgen receptor signaling in the nuclei of malignant cells directly stimulates growth of tumor cells. Imagine that you have a mouse model for human prostate carcinoma and are trying to elucidate the role of androgen receptor-dependent transcription. You want to find out which mouse genes, annotated as members of the androgen receptor signaling pathway, are differentially expressed in prostate carcinoma, and which of these are also transcription factors themselves, i.e. involved in regulation of transcription from RNA polymerase II promoter.

Task
Use the Expression Atlas database (http://www.ebi.ac.uk/gxa/) to search for such genes.

Exercise 3 Expression Atlas - DiGeorge Syndrome

Scenario
Imagine you are a scientist working on DiGeorge syndrome, a genetic disease caused by the loss of a portion of chromosome 22 in humans. You have recently discovered that Tbx1, the major gene in the disease, is expressed in mouse brain, and so you want to gain information on the expression of TBX1 in the human brain.

Task
Use the Expression Atlas database (http://www.ebi.ac.uk/gxa/) to find information on TBX1 expression in the human brain.

Need some help?

Exercise 1

1. Open the Archive homepage, http://www.ebi.ac.uk/arrayexpress/.
2. Click on the 'Browse experiments' link to load the more elaborate search bar.
3. In the new page, start typing 'prostate adenocarcinoma' in the experiment search box [A] and select the matching term from the drop-down menu.
4. Restrict your search to the organism Homo sapiens [B].
5. Restrict your search to 'RNA assay' [C] and 'sequencing assay' [D]. You do not need to worry about the 'All arrays' option, as it is only used when you want to filter for experiments done on a specific microarray platform (e.g. Affymetrix mouse 3' IVT array).
6. Click 'Query'. The results now show only prostate adenocarcinoma experiments in human, obtained from HTS-based experiments.
7. You can now look more carefully at individual experiments to identify the one that might be most relevant for your research. Let's explore E-GEOD-24284.
8. Click on the '+' sign for experiment E-GEOD-24284 and explore the information available for this experiment. This is an interesting experiment because the investigator performed both microarray and sequencing analyses on the samples. In particular, take a look at the 'Samples (20)' section. Click on the 'Click for detailed sample information and links to data' link (circled in red in the screenshot below) to find out more about the samples analyzed in this experiment and how they relate to the data files.

In the new page that opens up, microarray samples are shown in the top panel and sequencing samples in the bottom panel (not shown in the panel below). The most interesting columns from the SDRF (Sample and Data Relationship Format) file are shown.

The SDRF file is fundamental to interpreting the data associated with the experiment. You can also view the full SDRF files for the experiment E-GEOD-24284 by clicking on the E-GEOD-24284.hyb.sdrf.txt and E-GEOD-24284.seq.sdrf.txt links towards the bottom of the experiment's page (indicated by the red arrow on the previous screenshot).
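Because an SDRF file is a tab-delimited table, the relationship between samples and data files can also be read programmatically. The following is a minimal sketch, not part of the original practical. The column headers "Source Name", "Assay Name" and "Array Data File" follow the MAGE-TAB convention, but the sample names and file names are invented placeholders and are not taken from E-GEOD-24284. The helper name files_by_sample is likewise hypothetical.

```python
import csv
import io

# Hypothetical SDRF excerpt: the sample and file names below are invented
# for illustration and are NOT taken from E-GEOD-24284.
SDRF_TEXT = (
    "Source Name\tCharacteristics[organism]\tAssay Name\tArray Data File\n"
    "sample_1\tHomo sapiens\tassay_1\tsample_1.CEL\n"
    "sample_2\tHomo sapiens\tassay_2\tsample_2.CEL\n"
    "sample_2\tHomo sapiens\tassay_3\tsample_2_rep.CEL\n"
)


def files_by_sample(sdrf_text, sample_col="Source Name", file_col="Array Data File"):
    """Map each sample name to the list of data files recorded for it."""
    # SDRF rows are tab-separated; the first row holds the column headers.
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    mapping = {}
    for row in reader:
        mapping.setdefault(row[sample_col], []).append(row[file_col])
    return mapping


sample_files = files_by_sample(SDRF_TEXT)
for sample, files in sample_files.items():
    print(sample, "->", ", ".join(files))
```

To try this on a real file, the text of a downloaded SDRF file could be passed in place of SDRF_TEXT, with the column names adjusted to whatever headers that file actually contains.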

Exercise 2

1. Open the Atlas homepage, http://www.ebi.ac.uk/gxa/.
2. Start typing 'androgen' in the 'Genes' search box [A] and select the suggestion 'androgen receptor signaling pathway' from the drop-down menu.
3. Restrict your search to the organism Mus musculus [B].
4. Type 'prostate' in the condition search box [C]. You will see a list of suggested terms in a drop-down list. Hover your mouse cursor over the suggested terms to reveal their EFO IDs. The term you need is prostate carcinoma [EFO_0001663].
5. Click 'Search Atlas' [D]. You will get results which look like this:
6. The results of your search are presented as a heatmap view [E] that shows genes belonging to the 'androgen receptor signaling pathway' which were identified as differentially expressed in the condition 'prostate carcinoma' in mouse. The EFO tree has been expanded so you can see the child term too, prostate intraepithelial neoplasia.
7. You can now add further criteria to your search and look only for the genes involved in regulation of transcription from RNA polymerase II promoter.

8. Click on the 'Advanced search' link [F]. You will be presented with a more elaborate search box:
9. From the 'Gene property' menu [H] select 'Gene Ontology term'. By doing this, you will add an additional search box [G].
10. In the new search box type 'Regulation of transcription from RNA polymerase II promoter' and select the suggestion for the matching term.
11. Click on 'Search Atlas' to refresh your results. The results now show only genes that are differentially expressed in mouse prostate carcinoma, annotated as members of the androgen receptor signaling pathway, and involved in the regulation of transcription. They should represent an interesting set of genes for your project.
12. Click on the heatmap cells to explore the studies which identified the genes as differentially expressed under your queried conditions. You can also click on an individual gene to explore its wider expression pattern (e.g. in different cell types and tissues).
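Conceptually, the search built up in this exercise is an intersection of gene sets: each added criterion keeps only the genes that satisfy all criteria so far. The sketch below illustrates that logic only and does not show how the Atlas implements its search. The gene names are deliberately neutral placeholders (GeneA to GeneF), not real annotations, and the function name candidate_genes is hypothetical.

```python
# Hypothetical gene sets: the names are placeholders, not real annotations.
pathway_genes = {"GeneA", "GeneB", "GeneC", "GeneD"}             # pathway members
differentially_expressed = {"GeneB", "GeneC", "GeneD", "GeneE"}  # DE in the condition
transcription_regulators = {"GeneC", "GeneD", "GeneF"}           # GO term annotation


def candidate_genes(*gene_sets):
    """Return, in sorted order, the genes present in every given set."""
    return sorted(set.intersection(*gene_sets))


# Steps 2 to 5: pathway membership AND differential expression.
after_basic_search = candidate_genes(pathway_genes, differentially_expressed)

# Steps 8 to 11: the Gene Ontology criterion narrows the list further.
after_advanced_search = candidate_genes(
    pathway_genes, differentially_expressed, transcription_regulators
)

print(after_basic_search)     # ['GeneB', 'GeneC', 'GeneD']
print(after_advanced_search)  # ['GeneC', 'GeneD']
```

Each extra criterion can only shrink the result or leave it the same size, which is why the advanced search in step 11 returns a subset of the genes shown in step 6.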

Exercise 3

1. Open the Atlas homepage, http://www.ebi.ac.uk/gxa/.
2. Since you are interested in TBX1 expression, type 'TBX1' in the 'Genes' search box and select, from the suggestions, the matching term for the human gene, ENSG00000184058. Please ignore the suggestion for TBX1 (LRG_226).
3. Click on 'Search Atlas'.
4. You now have the gene view page for TBX1. Click on the heatmap cell corresponding to 'brain' in the human anatomogram.
5. Click on the 'expression profile' graph for the experiment E-AFMX-6.
6. You can now explore the experiment to find information on the differential expression of TBX1 in the human brain.

The experiment view page allows you to visualize the changes in gene expression within a specific experiment.