A DNA computing approach to solve Task Assignment problem in Real Time Distributed computing System. Abstract

Size: px
Start display at page:

Download "A DNA computing approach to solve Task Assignment problem in Real Time Distributed computing System. Abstract"

Transcription

1 A DNA computng approach to solve Task Assgnment problem n Real Tme Dstrbuted computng System Aser Avnash Ekka Department of Computer Scence & Engneerng NIT Rourkela, ORISSA, INDIA Bbhudatta Sahoo Department of Computer Scence & Engneerng NIT Rourkela, ORISSA, INDIA Abstract Early attenton has focused on DNA because ts propertes are extremely attractve as a bass for a computatonal system. In ths paper, we have proposed a new framework for assgnng task n heterogeneous Real tme dstrbuted system(rtds) ncludng the cost of path connectng the nodes. The proposed approach s based upon a DNA replcaton technque usng fxed length codng to select the computng node to whch task to be assgned n RTDS. 1. Introducton DNA computng s beng establshed as a vable alternatve to solve varous NP complete problems. These alternatve approaches to performng exhaustve search n soluton space are found to be more effcent for varous ntractable problems. The computer and the DNA both use nformaton embedded n smple codng, the bnary software code and the quadruple genomc code, respectvely, to support system operatons. On top of the code, both systems dsplay a modular, mult-layered archtecture, whch, n the case of a computer, arses from human engneerng efforts through a combnaton of hardware mplementaton and software abstracton. A process represents just sequentally ordered actons by the CPU and only vrtual parallelsm can be mplemented through CPU tme-sharng. Whereas process management n a computer may smply mean job schedulng, coordnatng pathway bandwdth through the gene expresson machnery represents a major process management scheme n a DNA. In summary, a DNA can be vewed as a super-parallel computer, whch computes through controlled hardware composton. A comparson between the archtecture of a computer and a cell, wthn whch DNA resdes, has been gven n Fgure 1. Nowadays the research effort n the area of DNA computng concentrates on four man problems: desgnng algorthms for some known combnatoral problems, desgnng new basc operatons of DNA computers, developng new ways of encodng nformaton n DNA molecules and reducton of error n DNA based computatons [1]. So, we are nspred to use DNA computng to solve the assgnment and schedulng problem n dstrbuted system, whch s a NP-Hard problem. In ths paper a DNA based algorthms has been proposed for solvng the task assgnment problem on real tme dstrbuted system. To our best knowledge t s the frst attempt to solve ths problem by molecular algorthms. 1

2 Fgure 1: A smplstc schematc comparson of the archtecture of a computer and a cell. Note that RNA acts as a messenger between DNA Major area of evolutonary computng research focuses on a set of technques nspred by the bologcal scences, because bologcal organsms often exhbt propertes that would be desrable n computer systems. DNA computng s one of the evolutonary computaton technques that, uses teratve process based upon varous technques nspred by bologcal mechansm of evoluton. A DNA algorthm s based on the general scheme of an evolutonary algorthm as outlned n Algorthm 1 n the pseudo-code fashon [2]. Algorthm 1. The general scheme of an evolutonary algorthm n pseudo-code BEGIN INITIALISE populaton wth random canddate solutons; EVALUATE each canddate; REPEAT UNTIL (TERMINATION CONDITION s satsfed) DO 1. SELECT parents; 2. RECOMBINE pars of parents; 3. MUTATE the resultng offsprng; 4. EVALUATE new canddates; 5. SELECT ndvduals for the next generaton; OD END The proposed algorthm uses ths general scheme of an evolutonary algorthm to solve Task Assgnment problem n Real Tme Dstrbuted computng System The rest of the paper s organzed as follows. The next secton dscusses bascs of DNA computng. Secton 3 descrbes Real tme dstrbuted computng system (HDCS) structure and task assgnment problem. DNA Computatonal Model for task assgnment s presented n Secton 4. Secton 5 ncludes the proposed algorthm for task assgnment 2

3 based upon the computatonal model as dscussed n prevous secton. Fnally, conclusons and drectons for future research are dscussed n Secton DNA Computng DNA computng s to use DNA as the platform to compute by means of molecular bology technques for encodng nformaton, generatng potental solutons, and selectng and dentfyng correct solutons [3]. DNA computng, also known as molecular computng, s a new approach to massvely parallel computaton based on groundbreakng work by Adleman. DNA (deoxyrbonuclec acd) s a double stranded sequence of four nucleotdes; the four nucleotdes that compose a strand of DNA are as follows: adenne (A), guanne (G), cytosne (C), and thamne (T); they are often called bases. James Watson and Francs Crck dscovered the famous double helx chemcal structure of DNA n It conssts of a partcular bond of two lnear sequences of bases. Ths bond follows a property of complementarly: adenne bonds wth thamne (A-T) and vce versa (T-A), cytosne bonds wth guanne (CG) and vce versa (G-C). Ths s known as Watson-Crck complementarly. Each DNA strand has two dfferent ends that determne ts polarty: the 3. end, and the 5. end. The double helx s an ant-parallel (two strands of opposte polarty) bondng of two complementary strands. DNA computers work by encodng the problem to be solved n the language of DNA: the base-four values A, T, C and G. Usng ths base four number system, the soluton to any concevable problem can be encoded along a DNA strand lke n a Turng machne tape. Every possble sequence can be chemcally created n a test tube on trllons of dfferent DNA strands, and the correct sequences can be fltered out usng genetc engneerng tools. There are four reasons for usng molecular bology to solve computatonal problems. 1. The nformaton densty of DNA s much greater than that of slcon: 1 bt can be stored n approxmately one cubc nanometer. Others storage meda, such as vdeotapes, can store 1 bt n 1,000,000,000,000 cubc nanometer. 2. Operatons on DNA are massvely parallel: a test tube of DNA can contan trllons of strands. Each operaton on a test tube of DNA s carred out on all strands n the tube n parallel 3. DNA computng s an nterdscplnary feld where: bologsts, computer scentsts, physcs, mathematcans, chemsts, etc. fnd a lot of nterestng problems whch can be appled to both theoretcal and practcal areas of DNA computng. 4. Error tolerance capablty: the DNA exhbts much better error tolerance capablty, resultng n mproved system robustness. Redundancy plays a major role. The DNA have multple mechansms to process the same nformaton. If one fals, there are other ways to ensure that crucal cellular processes, such as programmed cell death, are completed [4]. NP-complete problems have an exponental number of solutons, and whle t s easy to verfy that an nstance s a soluton, t s hard to fnd a correct soluton. To solve ths, a nature nspred technque s helpful that uses bology and chemstry to generate all possble solutons and flter them quckly. Ths s possble through DNA codng of the 3

4 problem as t s possble to store 2 60 bts of nformaton n 1 ml of DNA. Ths massve parallelsm property of DNA can be used to generate and test the possble combnatons, that can be the optmal solutons for a NP-complete problem. 3. Real Tme Dstrbuted Computng System A real tme dstrbuted computng system (DCS) utlzes a dstrbuted sute of dfferent hgh-performance machnes, nterconnected wth hgh-speed real tme communcaton network, to perform dfferent computatonally ntensve applcatons that have dverse computatonal requrements and dead lne. Real tme dstrbuted computng provdes the capablty for the utlzaton of remote computng resources and allows for ncreased levels of flexblty, relablty, and modularty. Resource management sub systems of the DCS are desgnated to schedule the executon of the tasks that arrve for the servce to meet the specfc dead lne. A RDCS envronments are well suted to meet the computatonal demands of large, dverse groups of real tme tasks. We have consdered the DCS where, the real tme tasks are assumed to be ndependent,.e., no communcatons between the tasks are needed. The ndvdual users of the systems are ndependently submttng ther jobs to the central scheduler. The central scheduler operates usng a dynamc scheme, because the arrval tmes of the tasks may be random and some machnes n the sute may go off-lne and new machnes may come on-lne. 3.1 Real Tme Dstrbuted System Archtecture The real tme dstrbuted system archtecture s modeled by a graph G=(V, E) where vertces are nodes, and edges are b-drectonal pont-to-pont communcaton lnks [8,9]. A node N s made of one processor PE, one local memory LM, and at least one communcator CC. A processor PE executes Tasks T sequentally, reads from and wrtes data nto ts local memory MM. A communcator CC cooperates wth another communcator CC j n order to execute sequentally transfers of data stored n the memory (send or receve) between processors through a lnk. Fgure 2. A Real Tme Dstrbuted Computng Systems model 4

5 Fgure 2 gves an example of G, wth four nodes Node 1, Node 2, Node 3 and Node 4, and an nterconnecton network where each node s made of one processor PE, one local memory LM, and a communcator CC. To each processor PE we assocate a lst of tasks (D, C =PE k ), where D s the deadlne and C s the worst-case executon tme (WCET) of the task T on processor PE. Snce the target archtecture s heterogeneous, the WCET for a gven processor can be dstnct on each processor. Nodes have ther own RTOS [10]. 3.2 Workload Model A Drected Acyclc Graph DAG s used to model a real tme tasks wth dependent subtasks. The basc task model s modeled by a set of N perodc tasks T : Π = { T = ( P, D, C ) 1 N} (1) where P s the perod of the task. Each task s released after every P tme unts. For nonperodc tasks, P s represented by the mnmum (or average) separaton tme between two consecutve releases. The dfference n tme between the arrvals of two consecutve nstances of a perodc task s always fxed and s referred to as perod of that task. Although the nter arrval tmes of nstances of a perodc task are fxed, the nter release tme may not be. D s the relatve deadlne, the perod of tme after the release tme wthn whch the task has to fnsh. Tasks can have an arbtrary deadlne. C s the worstcase executon tme of the task at each release C > D,.e. the maxmum tme span between release of a task and end of the response of that release. In real tme systems t s often necessary to determne an upper bound of tme n that the program block s executed. Operatons of T can be only a computaton operaton. The task T s nvoked repeatedly at each perod from the sensors. Equaton 1 gves an example of T. We consder each Task T to consst of a set of subtasks, whch execute serally. For convenence, we denote the set of subtasks of task T as shown n We are consderng tasks as C > D > P. As we consder on-lne real tme dstrbuton schedulng heurstcs, the arrval, dstrbuton, the executon of operatons and communcatons are tme-trggered, each task arrves after a gven perod. Fgure 3. Task Graph for T wth four subtasks We consder each task T to conssts of set of subtasks each havng tmng and precedence constrant [11] as shown n Fgure 3. The subtask n a precedence constrant task st 4 cannot be executed untl the subtasks preceded by t have completed ther executon.e. st 2 and st 3. We have consdered dfferent number of subtask for each task and the dependency s dfferent for each task at each nvocaton at each perod. 5

6 4. DNA Computatonal Model HDCS envronments are well suted to meet the computatonal demands of large, dverse groups of tasks. The problem of optmally mappng (defned as matchng and schedulng) these tasks onto the machnes of a dstrbuted HC envronment has been shown, n general, to be NP-complete, hence DNA replcaton technques can be used to fnd a soluton. Ths approach s based on Adleman s fndng to solve the TSP problem usng DNA soluton technque n 1994 [5]. The technque uses the possble combnatons of vald routes among the ctes by Marge and anneals the two-soluton set. One of the solutons s the ctes DNA molecule and the other as edge DNA molecule. We have use a DNA replcaton technque usng fxed length codng to select the computng node to whch task s to be assgned n RTDS as shown n Fgure 4. The desgnated computng node executes the task generated wth real tme constrants. The approach to ths work starts by relatng each task on the source computng node as salesman [5] and that after the destnaton node s selected the task s transferred to that ste. Table 1 shows the cty and cost fxed length code for fve nodes and costs n RTDS [6]. Fgure 4: Schematc structure of RTDS 4.1 Encodng The encodng scheme, whch ncludes the communcaton cost between a gven par of computng nodes.e., source and destnaton s presented n [6,7]. Table 1 shows the encoded values for the gven RTDS as shown n Fgure 4. Node Sequence 1 A G T C G G 2 G T G G A C 3 C A G T A A 4 T A T G G G 5 A A G G C C Cost 3 A T G A T A 5 G G A T A A 6

7 4.2 Operatons 7 T A C T A A 9 C C T G C A 11 T T A C C A Table 1: Node and cost sequences for the fve node RTDS The basc operatons assocated wth DNA algorthm are usually desgned for selectng whch satsfy some partcular condtons. On the other hand there may be dfferent sets of such basc operatons. The set of operatons used n ths problem are: 1. MERGE: gven test tube TU 1 and TU 2, create a new tube T contanng all strands form TU 1 and TU AMPLIFY: gven tube TU create copy of them. 3. DETECT: gven tube TU return true f TU contans at least one DNA strand, otherwse return false. 4. SEPARATE: gven tube TU and word w over alphabet DNA create two tubes +(TU;w) and -(TU;w), where +(TU;w) conssts of all strand from TU contanng w as a substrng and -(TU;w) conssts of the remanng strands. 5. LENGTH-SEPARATE: gven tube TU and postve nteger n create tube (TU;< n) contanng all strands from TU whch are of length n or less. DNA 6. POSITION-SEPARATE: gven tub TU and word w over alphabet create tube B(TU;w) contanng all strands from TU whch have w as prefx and tube E(TU;w) contanng all strands from TU whch have w as suffx. Each of the above operatons s a result [1,3]. of some standard bochemcal procedure 4. Proposed Algorthm for Task Assgnment Tasks n real tme systems can be ether perodc or aperodc n nature. In real tme systems the functons not just to be logcally correct but also to be wthn deadlne. Tasks are sad to be perodc when they arrve wthn a fxed nterval of tme. The perodc real tme tasks can be descrbed by ts perod, deadlne and the worst-case executon tme. The proposed task assgnment/allocaton scheme for RTDS s conssts of followng four basc steps: Algorthm: Task Assgnment usng DNA 1. Soluton Space Generaton Phase. Encode each computng node and the cost of communcaton wth 6 base strands and the edge between each node wth complementary base. Create copes of them. 7

8 2. Select tnerares that start and end wth the correct nodes. Select strands, whch have start node as the centralzed scheduler and end nodes as the possble connected destnaton node. 3. Select tnerares that contan the least cost of communcaton. Check the cost of the DNA strand by decodng the code sequence of the cost between the source and destnaton node. 4. Select tnerares that gve feasble schedule. Remove the last sx codes of the top strand.e. node N2. Check the feasblty of sendng the task f assgned to t dependng on avalable resources such as avalable CPU tme, deadlne constrant of the task whch may or may not depend on the perod of the task. If the choce s feasble then the assgnment s made else go for the next avalable strand. Ths proposed algorthm s used to desgn a fault tolerant real-tme schedulng scheme for hard real tme task. Ths scheme results multple feasble allocaton schemes, wth out computatonal overhead n comparsons to other teratve evolutonary technques. The proposed scheme s outlned n Fgure 5 (darkened block) as shown n Fgure 5. Fgure 6 depcts the proposed computaton result for transferrng tasks from node N1 to node N2. 5. Concluson The research n DNA computng s n a prmary level. Hgh nformaton densty of DNA molecules and massve parallelsm nvolved n the DNA reactons make DNA computng a powerful tool. Tacklng problems wth DNA computng would be more approprate when the problems are computatonally ntractable n nature. The most challengng task s concern wth effcent representaton of the problem usng four nucleotdes of DNA. Ths s always a open problem for the researchers, to represent a problem so that, each teraton yeld results solutons for the sad problem. Due to ts hgh degree of parallelsm n DNA Computng, that can overcome the dffcultes arse to solve ntractable problem on slcon computers. The proposed deas and methods show promsng results that DNA computng approach can be well suted for solvng such real-world applcaton n the near future. 8

9 Fgure 5: Developng a DNA based RTDS schedule Fgure 6: DNA Computaton for N1 N2 ncludng the cost. 9

10 References: [1] Potr Formanowcz. DNA computng. Computatonal Methods n Scence and Technology, 11(1):11 20, [2] A.E. Eben and J.E. Smth. Introducton to Evolutonary Computng, chapter What s an Evolutonary Algorthm? Natural Computng Seres. Sprnger ISBN , [3] Yang ren Rau and Huey jenn Chang. DNA computng on the mnmum spannng tree problem. Internatonal Symposum on Nanoelectronc Crcuts and Gga-scale Systems (ISNCGS 2004), pages , 2004 [4] Degeng Wang and Mchael Grbskov. Examnng the archtecture of cellular computng through a comparatve study wth a computer. Journal of The Royal Socety Interface, 2(3), June [5] Leonard M. Adleman. Molecular computaton of solutons to combnatoral problems. Scence, 266: , November [6] S.Y. Shn, B.T. Zhang, and S.S. Jun. Solvng travelng salesman problems usng molecular programmng. Proceedngs of the Congress on Evolutonary Computaton, 2: , [7] J Youn Lee, Soo-Yong Shn, Ta Hyun Park, and Byoung-Tak Zhang. Solvng travelng salesman problems wth DNA molecules encodng numercal values. BoSystems, 78:39 47, [8] Xao Qn; Hong Jang and Swanson, D.R, "An effcent fault-tolerant schedulng algorthm for real-tme tasks wth precedence constrants n heterogeneous systems", Proceedngs of Internatonal Conference on Parallel Processng, Page(s): , Aug [9] Yongbng Zhang; Hakozak, K.; Kameda, H.; Shmzu, K., "A performance comparson of adaptve and statc load balancng n heterogeneous dstrbuted systems", Proceedngs of the 28th Annual Smulaton Symposum, Aprl 1995 pp [10] Rajb Mall, Real-Tme Systems, Pearson Educaton 07. [11] I. Gupta, G. Manmaran, and C. Sva Ram Murthy, "Prmary-Backup based faulttolerant dynamc schedulng of object-based tasks n multprocessor real-tme systems," Dependable Network Computng, Kluwer Academc Publshers, 10