International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015
ISSN 2250-3153

Common Up Regulated and Down Regulated Genes for Multiple Cancers using Microarray Gene Expression Analysis

Apoorva.D *, Dr.Gurumurthy.H **

* Department of Information Science and Engg, Dr.TTIT, India
** Department of Biotechnology, G M. Institute of Technology, India

Abstract- Cells are the building blocks of living things. Normal cells multiply when the body needs them and die when the body doesn't need them. Cancer appears to occur when the growth of cells in the body is out of control and cells divide too quickly. In cancer, cells display uncontrolled multiplication, invasion and metastasis, caused by abnormalities in the genetic material of the transformed cells. Samples of microarray experiments performed on Homo sapiens normal and cancerous cells were downloaded from the GEO database and imported into the CLC Main Workbench software, where expression analysis was performed to identify and rank common differentially expressed genes in multiple types of cancer. The technique used, DNA microarray data analysis, measures the expression of a large number of genes simultaneously, and it provides invaluable information on disease pathology, progression, resistance to treatment and therapeutic approaches for cancer. The genes found are useful for drug design as they act as biomarkers, and they can also be used in further analysis of the fundamental signal transduction pathways that lead to carcinomas. Since most genes that cause one cancer are also responsible for causing other cancers (e.g. a gene causing breast cancer may also cause ovarian cancer), finding such genes can help in preventing multiple cancers.

Index Terms- GEO database, CLC Main Workbench, Differentially Expressed Genes, Microarray

I. INTRODUCTION

Cancer is a category of disease in which rapidly created abnormal cells grow beyond their usual boundaries, and which can then invade adjoining parts of the body and spread to other organs. Cancers are caused by abnormalities in cells which may be due to the effects of carcinogens, such as tobacco smoke, radiation, chemicals, or infectious agents.
Other cancer-promoting genetic abnormalities may occur randomly through errors in DNA replication, or may be inherited. The heritability of cancers is usually affected by complex interactions between carcinogens and the host's genome. Since cancers can occur due to gene mutation in the cells, finding those genes can be helpful. A cell contains a large number of genes, so a popular technique called microarray technology is used to find out the expression of a large number of genes simultaneously. A DNA microarray is a collection of DNA spots attached to a solid surface; a well-known example is the Affymetrix chip. Each DNA spot has a specific sequence and is called a probe. Microarray methods were initially developed to study differential gene expression using complex populations of RNA. Refinement of these methods now permits the analysis of copy number imbalances and gene amplification of DNA.

Figure 1: Microarray technology

Ten types of cancer were selected for this research: colon, breast, ovarian, lung, pancreatic, gastric, liver, thyroid, salivary gland and pituitary cancer. These types of cancer were selected based on the statistical data obtained from authentic sources like GEO and the supplementary information of published manuscripts. Since no common up regulated and down regulated genes were previously identified for multiple types of cancer, this research concentrates on identifying the common up regulated and down regulated genes by performing expression analysis using CLC Main Workbench. Performing statistical analysis by hand for a large number of genes is very hard and error-prone, but the CLC Main Workbench software makes this analysis easy to perform.

II. MATERIALS AND METHOD

Some of the databases and tools used are:

A. Gene Expression Omnibus: It is a public repository that archives and distributes microarray, next generation sequencing, and other forms of functional genomic data submitted by the scientific community. It is a microarray database provided by NCBI (http://www.ncbi.nlm.nih.gov) that allows users to download experiments and curated gene expression profiles. The datasets for different cancers are downloaded from this database
in order to perform expression analysis. The downloaded datasets are stored in ZIP/WinRAR format. The link for the GEO database is http://www.ncbi.nlm.nih.gov/geo.

The data in the GEO database is organized into platforms, samples, series and datasets. A platform record is composed of a summary description of the array or sequencer and, for array-based platforms, a data table defining the array template. Each platform record is assigned a unique and stable accession number (GPLxxx). A platform may reference many samples submitted by multiple submitters. Sample records describe the conditions under which an individual sample was handled, the manipulations it underwent, and the abundance measurement of each element derived from it. Each sample is assigned a unique and stable accession number (GSMxxx). A sample entity must reference only one platform and may be included in multiple series. A series record links together a group of related samples and provides a focal point and description of the whole study. Each series record has a unique and stable accession number (GSExxx). A dataset is a curated collection of biologically and statistically comparable GEO samples and forms the basis of GEO's suite of data display and analysis tools. Samples within a dataset refer to the same platform, i.e. they share a common set of array elements.

B. CLC Main Workbench: It is an application with a graphical user interface, and its functions are used by thousands of researchers for DNA, RNA and protein sequence analysis, such as gene expression analysis, primer design, molecular cloning, phylogenetic analyses, and sequence data management. It is available on Windows, Mac OS X, and Linux. CLC Main Workbench has a Navigation Area, View Area, Menu Bar, Toolbar, Status Bar and Toolbox.

The Navigation Area is located on the left side of the screen, under the toolbar. It is used for organizing and navigating data. Its behavior is similar to the way files and folders are usually displayed on a computer. The data in the Navigation Area is organized into a number of locations.
When the Workbench is started for the first time, there is one location called CLC_Data. Data can be added to the Navigation Area in a number of ways. Files can be imported from the file system or dragged into the Navigation Area.

The View Area is the right-hand part of the screen, displaying the current work. The View Area may consist of one or more views, represented by tabs at the top of the View Area.

Toolbox and Status Bar: The Toolbox is placed on the left side of the user interface of CLC Main Workbench, below the Navigation Area. The Toolbox shows a Processes tab and a Toolbox tab. Clicking the Processes tab displays the previous and running processes. The tools in the Toolbox can be accessed by double-clicking or by dragging elements from the Navigation Area to an item in the Toolbox. The Status Bar is located at the bottom of the window. The left side of the bar indicates whether the computer is making calculations or is idle. The right side of the Status Bar indicates the range of the selection of a sequence.

Workspace: If we are working on a project and have arranged the views for the project, we can save this arrangement using workspaces. A workspace remembers the way we have arranged the views, and we can switch between different workspaces.

The method begins with retrieving the datasets from the GEO database; after the datasets are retrieved, expression analysis is carried out using CLC Main Workbench. First, the datasets are imported into CLC Main Workbench by clicking Import in the toolbar and selecting the file. The samples are stored in the Navigation Area. The next step is to tell CLC Main Workbench how the samples are related; this is done by setting up an experiment. An experiment is the central data type when analyzing expression data in CLC Main Workbench. It includes a set of samples and information about how the samples are related. The experiment is also used to accumulate calculations like t-tests and clustering. After an experiment is set up, an experiment table is opened.
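The idea of an experiment, a set of samples plus information about how they are grouped, can be illustrated with a minimal sketch in plain Python. This is not CLC Main Workbench code: the sample names, probe names and expression values are invented for illustration, and the conventions used for difference (case mean minus baseline mean) and fold change (signed ratio of group means) are assumptions.

```python
import math
from statistics import mean

# Hypothetical experiment: each sample maps probe IDs to expression values.
samples = {
    "normal_1": {"probe_a": 100.0, "probe_b": 400.0},
    "normal_2": {"probe_a": 140.0, "probe_b": 360.0},
    "tumor_1": {"probe_a": 480.0, "probe_b": 90.0},
    "tumor_2": {"probe_a": 520.0, "probe_b": 110.0},
}
# The grouping tells the analysis how the samples are related.
groups = {"normal": ["normal_1", "normal_2"], "tumor": ["tumor_1", "tumor_2"]}


def summarize(probe, baseline="normal", case="tumor"):
    """Return summary values for one probe, similar to an experiment table row."""
    base_mean = mean(samples[s][probe] for s in groups[baseline])
    case_mean = mean(samples[s][probe] for s in groups[case])
    values = [samples[s][probe] for s in samples]
    # Assumed convention: positive fold change = up regulated in the case group.
    if case_mean >= base_mean:
        fold_change = case_mean / base_mean
    else:
        fold_change = -base_mean / case_mean
    return {
        "range": max(values) - min(values),
        "difference": case_mean - base_mean,
        "fold_change": fold_change,
        "log2_ratio": math.log2(case_mean / base_mean),
    }


for probe in ("probe_a", "probe_b"):
    print(probe, summarize(probe))
```

In this toy data, probe_a comes out up regulated in the tumor group (positive difference) and probe_b down regulated (negative difference).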
The experiment table includes the expression values for each sample and, in addition, a few extra values such as the range, interquartile range, fold change and difference values. The experiment is saved, and the expression analysis can proceed.

Next, quality control is performed. First an MA plot is created; since an MA plot compares two samples, two of the arrays are selected and a plot is created. Next, the same two arrays are selected, log 2 transformation is chosen, and an MA plot is created again. This results in a quite different plot. The next step is to transform the expression values within the experiment, since this is the data used in further analysis. When the table is opened, all the samples have an extra column with transformed expression values; there are also extra columns for the transformed group means and transformed IQR. The next step is to examine and compare the overall distribution of the transformed expression values in the samples, so a box plot is created. The next step in quality control is to check whether the overall variability of the samples reflects the grouping, so principal component analysis is performed. To complement the principal component analysis, hierarchical clustering of samples is done to see if the samples cluster in the groups we expect.

The next step is to identify and investigate the genes that are differentially expressed. A statistical test is carried out to identify the genes that are differentially expressed between the two groups. The FDR-corrected p-value of the transformed values is used to retain only the genes with a value below 0.0005. One last criterion is then added to the filter: the difference should have an absolute value higher than a fixed cutoff.

The next step is to perform the annotation test, in which the gene list is annotated and the annotation is used to see if there is a pattern in the biological annotations of the genes in the list of candidate differentially expressed genes. Two types of annotation methods are used: Hypergeometric Tests on Annotations and Gene Set Enrichment Analysis (GSEA). The first step is to import an annotation file used to annotate the arrays.
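The filtering step described above (corrected p-value below 0.0005 plus a minimum absolute difference) can be sketched in plain Python. This is only an illustration under stated assumptions, not the Workbench's implementation: it assumes the FDR correction is of the Benjamini-Hochberg type, takes the raw p-values as given, and uses invented gene names, p-values and a placeholder difference cutoff of 1.0.

```python
def bonferroni(p_values):
    """Bonferroni correction: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(rows, fdr_cutoff, min_abs_difference):
    """Split genes passing both filters into up and down regulated lists."""
    fdr = benjamini_hochberg([row["p_value"] for row in rows])
    up, down = [], []
    for row, q in zip(rows, fdr):
        if q < fdr_cutoff and abs(row["difference"]) > min_abs_difference:
            (up if row["difference"] > 0 else down).append(row["gene"])
    return up, down


# Hypothetical test results; names and numbers are placeholders.
rows = [
    {"gene": "gene_a", "p_value": 1e-6, "difference": 2.5},
    {"gene": "gene_b", "p_value": 2e-5, "difference": -3.0},
    {"gene": "gene_c", "p_value": 0.03, "difference": 4.0},
    {"gene": "gene_d", "p_value": 0.5, "difference": 0.1},
]
up, down = select_genes(rows, fdr_cutoff=0.0005, min_abs_difference=1.0)
print("up regulated:", up)
print("down regulated:", down)
```

Here gene_c has a large difference but fails the FDR cutoff, which shows why both criteria are applied together.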
The annotation file can be downloaded from the website http://www.affymetrix.com/support/technical/annotationfilesmain.affx. The annotation file is imported into CLC Main Workbench. Add Annotations is selected under Annotation Test in the Expression Analysis toolbox; the experiment and the annotation file are selected, and Next and Finish are clicked. Next, Hypergeometric Tests on Annotations is selected in the toolbox, under Annotation Test within Expression Analysis. The two experiments are selected and Next is clicked. GO biological process and
transformed expression values are selected, Next and Finish are clicked, and the test is performed. Next, Gene Set Enrichment Analysis is selected under Annotation Test within the Expression Analysis toolbox. The original full experiment is selected and Next is clicked. Transformed expression values is selected and Finish is clicked.

III. RESULTS

A. Differentially Expressed Genes.

Statistical analysis is done to identify genes that are differentially expressed between the two groups. The two corrected p-values, Bonferroni corrected and FDR corrected, are selected as parameters. For the analysis the FDR p-value is used, which is a measure that allows us to control how big a proportion of false positives (genes that we think are differentially expressed but really are not) we are willing to accept. To make a more refined selection of the genes that we believe to be differentially expressed, advanced filtering is used, which is located at the top of the experiment table. Transformed - FDR p-value correction is selected in the first drop-down box, < is selected in the next, and 0.0005 (or 0,0005 depending on locale settings) is entered. A cutoff of 0.0005 is strict: among the genes that pass the filter, the expected proportion of false positives is at most 0.05%. The reasoning is similar to diagnostic testing, where a test with very high specificity is needed when the disease prevalence is small, as otherwise there are too many false positive results.

B. Annotation test.

The gene list is annotated and used to see if there is a pattern in the biological annotations of the genes in the list of candidate differentially expressed genes.

Table 1: The number of up regulated and down regulated genes obtained for the cancers

Type of cancer           Up regulated   Down regulated
Colon cancer             5              0
Breast cancer            204            24
Gastric cancer           560            4
Liver cancer             4              3
Lung cancer              20             22
Ovarian cancer           949            405
Pancreatic cancer        6              3
Pituitary cancer and Salivary gland cancer: 3, 42, 39, 33
Thyroid cancer           900            40

Table 2: Shows up regulated common genes.

GO id    Description
19915    Lipid storage
14070    Response to organic cyclic substance
42493    Response to drug
6955     Immune response
60748    Tertiary branching involved in mammary gland duct morphogenesis
8064     Regulation of actin polymerization or depolymerization
2070     Epithelial cell maturation
50847    Progesterone receptor signaling pathway
33603    Positive regulation of dopamine secretion
1667     Ameboidal cell migration

Gene names: TRBC1 (T cell receptor beta constant 1), progesterone receptor.
Types of cancers: breast, pancreatic, thyroid, salivary, liver.

Table 3: Shows down regulated common genes.

GO id    Description                                                           Gene name
6915     Apoptosis                                                             IL9 (interleukin 9)
6810     Transport                                                             PDZK1 (PDZ domain containing 1)
30154    Cell differentiation                                                  Ifrd1 (interferon-related developmental regulator 1)
15031    Protein transport                                                     Napb (N-ethylmaleimide sensitive fusion protein attachment protein beta)
122      Negative regulation of transcription from RNA polymerase II promoter  Ctnnbip1 (catenin beta interacting protein 1)
6281     DNA repair                                                            TOP2A (topoisomerase (DNA) II alpha 170kDa)
6412     Translation                                                           Akt1 (thymoma viral proto-oncogene 1)
16192    Vesicle-mediated transport                                            Gsn (gelsolin)
6397     mRNA processing                                                       Srsf2 (serine/arginine-rich splicing factor 2)
6468     Protein phosphorylation                                               Akt1 (thymoma viral proto-oncogene 1)

Types of cancers: pancreatic, salivary.

IV. CONCLUSION

The work comprises investigating the common genes responsible for causing a staggering variety of cancers through their differential expression patterns, using a technique called DNA microarray. As an overview of the entire process, relevant data was obtained from GEO, tabulated and subjected to analysis, and common differentially expressed genes were found. As future scope, the identified set of genes which are common and differentially expressed in multiple cancers might be useful in the further analysis of the fundamental signal transduction pathways that lead to carcinomas, so that these genes can act as biomarkers for drug design. Since most cancers are caused by genes which are also responsible for causing other cancers (e.g. women having breast cancer have a chance of getting ovarian cancer), such cancers may be prevented by gene therapy targeting the common genes responsible for causing cancer.
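The central idea of the study, finding genes that appear in the differentially expressed lists of more than one cancer, can be sketched as a simple counting step. The sketch below is a hypothetical illustration in plain Python: the gene identifiers and cancer-to-gene assignments are placeholders, not results from this work, and the minimum of two cancers is an assumed threshold.

```python
from collections import Counter

# Placeholder per-cancer sets of up regulated genes (not real results).
up_regulated = {
    "breast": {"GENE1", "GENE2", "GENE3"},
    "ovarian": {"GENE2", "GENE3", "GENE4"},
    "thyroid": {"GENE3", "GENE5"},
}


def common_genes(gene_sets, min_cancers=2):
    """Rank genes found in at least `min_cancers` of the per-cancer sets."""
    counts = Counter(gene for genes in gene_sets.values() for gene in genes)
    shared = {
        gene: sorted(c for c, genes in gene_sets.items() if gene in genes)
        for gene, n in counts.items()
        if n >= min_cancers
    }
    # Genes shared by the most cancers come first; ties are broken by name.
    return sorted(shared.items(), key=lambda item: (-len(item[1]), item[0]))


for gene, cancers in common_genes(up_regulated):
    print(gene, "->", ", ".join(cancers))
```

The same function can be applied separately to the down regulated lists, giving one ranked list of common genes for each direction of regulation.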
AUTHORS

First Author - Apoorva.D, B.E, M.TECH, DR.TTIT, India, apoova.bhavikatte@gmail.com.
Second Author - Dr.Gurumurthy.H, M.Sc, PGD (Bioinformatics), PhD., MISTE., G M. Institute of Technology, India.