Tag Anti-Collision Resolution for Improved Quality of RFID Data Streams. Prapassara Pupunwiwat


Tag Anti-Collision Resolution for Improved Quality of RFID Data Streams

Prapassara Pupunwiwat
BIT (Hons)

School of Information and Communication Technology
Science, Environment, Engineering and Technology
Griffith University

A Thesis submitted in fulfillment of the requirements of the degree of Doctor of Philosophy

September 2011

Abstract

Radio Frequency Identification (RFID) is a technology that allows automatic identification of people or objects by using radio frequency waves to transmit data between networked electromagnetic readers and tags. RFID is considered an emerging technology for advancing a wide range of applications, such as supply chain management and distribution. However, despite the extensive development of RFID technology in many areas, RFID tag collision remains a serious issue. Collisions occur when multiple tags are simultaneously present within the reader zone. To solve collision problems, different anti-collision methods have been proposed in the literature. These methods are either insufficient or too complex, with a high implementation overhead. In this work, in order to improve the quality of RFID data collection, we propose novel deterministic and probabilistic anti-collision approaches. The main contributions of this study are summarised as follows:

- We propose two novel deterministic anti-collision algorithms using combinations of Q-ary trees (Pupunwiwat and Stantic, 2009a,b, 2010c), with the goal of minimising the memory used by the queries that the RFID reader issues. By reducing the size of queries, the RFID reader can conserve memory, and the identification time can be improved.
- We propose a novel frame-size estimation technique (Pupunwiwat and Stantic, 2010a,b) to minimise the number of slots and frames queried by the RFID reader and to maximise the system efficiency. In addition, we introduce the probabilistic group-based anti-collision method (Pupunwiwat and Stantic, 2010d) to improve the overall performance of the tag recognition process.
- We evaluate our proposed anti-collision techniques and perform a comparative analysis in order to find the benefits and disadvantages of each method. Additionally, in order to identify the best anti-collision method, we propose two strategies for selective anti-collision technique management, i.e. a Novel Decision Tree Strategy and a Six Thinking Hats Strategy (Pupunwiwat et al., 2011). By correctly identifying the most suitable anti-collision method for specific scenarios, the quality of data collection can be improved.

Statement of Originality

This work has not previously been submitted for a degree or diploma in any university. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made in the thesis itself.

Signed:
Prapassara Pupunwiwat
September 19, 2011

Acknowledgements

First and foremost, I would like to thank my supervisor, Dr Bela Stantic, for his extraordinary support and encouragement. During the past four years, I learned a lot from Dr Stantic about good research work and what it takes to accomplish it. He directed me through this journey, and I would not be where I am now without his guidance. I would also like to thank Professor Abdul Sattar for his valuable comments on my research. My thanks also go to the School of Information and Communication Technology for providing such a good research environment and for allowing me to gain academic skills through casual tutoring. I would like to specifically thank Mrs Rohana Wendt for her precious feedback on the writing aspect of my thesis.

Also, I would like to thank all my colleagues and university friends for their mental support, especially Mr Peter Darcy, who helped me through difficult times. My thanks also go to my childhood friends and my brother, who never failed to cheer me up when I least expected it. My special gratitude goes to Mr Herman Wendt, who consoled me and was there for me when I felt distressed, and who understood my need to devote most of my time to my thesis.

Finally, my special appreciation and thanks go to my mum and dad, who have been a tower of strength throughout my studies, for their support and understanding. I would never have become who I am now without their dedication and love.

List of Publications

List of Book Chapter

P. Pupunwiwat and B. Stantic, (2012). Managing Tag Collision in RFID Data Streams using Smart Tag Anti-Collision Techniques. Chipless and Chipped Radio Frequency Identification: Systems for Ubiquitous Tagging, IGI Global, (in press).
P. Darcy, P. Pupunwiwat and B. Stantic, (2012). The Fusion of Pre/Post RFID Correction Techniques to Reduce Anomalies. Intelligent Sensor Networks: Across Sensing, Signal Processing, and Machine Learning, CRC Press, (in press).
P. Darcy, P. Pupunwiwat, and B. Stantic, (2011). The Challenges and Issues facing the Deployment of RFID Technology. Deploying RFID Challenges, Solutions, and Open Issues (InTech2011), Rijeka, Croatia, InTech, Pages

List of Journal

P. Pupunwiwat, P. Darcy, and B. Stantic, (2011). Conceptual Selective RFID Anti-Collision Technique Management. Procedia Computer Science, volume 5, Ontario, Canada, ELSEVIER, Pages
P. Pupunwiwat and B. Stantic, (2007). Location Filtering and Duplication Elimination for RFID Data Streams. The International Journal of Principles and Applications in Information Science and Technology (PAIST), volume 1, Auckland, Albany, New Zealand, PAIST Press, Pages

List of Conferences

P. Pupunwiwat and B. Stantic, (2010). Joined Q-ary Tree Anti-Collision for Massive Tag Movement Distribution. Thirty-Third Australasian Computer Science Conference (ACSC 2010), Brisbane, Australia, Pages
P. Pupunwiwat and B. Stantic, (2010). Dynamic Framed-Slot ALOHA Anti-Collision using Precise Tag Estimation Scheme. Twenty-First Australasian Database Conference (ADC 2010), Brisbane, Australia, Pages
P. Pupunwiwat and B. Stantic, (2010). A RFID Explicit Tag Estimation Scheme for Dynamic Framed-Slot ALOHA Anti-Collision. Sixth Wireless Communications, Networking and Mobile Computing (WiCOM 2010), Chengdu, China, Pages 1-4.
P. Pupunwiwat and B. Stantic, (2010). Resolving RFID Data Stream Collisions using Set-Based Approach, Sixth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP 2010), Brisbane, Australia, Pages
P. Pupunwiwat and B. Stantic, (2009). Unified Q-ary Tree for RFID Tag Anti-Collision Resolution. Twentieth Australasian Database Conference (ADC 2009), Wellington, New Zealand, Pages
P. Pupunwiwat and B. Stantic, (2009). Performance Analysis of Enhanced Q-ary Tree Anti-Collision Protocols. First Malaysian Joint Conference on Artificial Intelligence (MJCAI 2009), Kuala Lumpur, Malaysia, Pages

Contents

1 Introduction

2 RFID Background
  History of RFID
  RFID Fundamentals
  Elements of Radio Frequency Communication
  Radio Spectrum
  Interference and Multipath of RF waves
  RFID Technology Overview
  RFID System Mechanism
  Characteristic of RFID data
  Main RFID Commercial Applications
  RFID Technology in Supply Chain
  RFID Core Components
  RFID Antenna
  RFID Reader
  RFID Tag
  RFID Data Management Issues
  Data Capturing Process
  Data Processing and Event Management
  Data Warehousing and Data Mining
  Summary

3 RFID Data Streams Management Techniques
  Filtering of RFID Data Streams
  Unreliable Reads
  Noises
  Duplications
  Missed Reads
  Collision Handling in RFID Data Streams
  RFID Collision Types
  Division Classification for Multi-Access
  Taxonomy of RFID Tag Anti-Collision Protocols
  Deterministic Anti-Collision Protocols
  Binary Search
  Bit Arbitration
  Tree Splitting
  Query Tree
  Probabilistic Anti-Collision Protocols
  BFSA Method
  DFSA Method
  EDFSA Method
  Other ALOHA-Based Methods
  Backlog Estimation Techniques
  Discussion
  Limitations of Existing Methods
  Research Problem
  Summary

4 Deterministic Anti-Collision Approaches
  EPC Encoding Schemes Analysis
  General Identifier 96 Bits
  Serialised Global Trade Item Number 96 Bits
  Global Individual Asset Identifier 96 Bits
  Warehouse Distribution Scenarios
  4.2.1 Unique Item-Level Scenario
  Unique Container-Level Scenario
  Unique Company-Level Scenario
  Unique Warehouse-Level Scenario
  Splitting Fitness
  Worst-Case Splitting
  Perfect Splitting
  Random Splitting
  Unified Q-ary Tree
  Unified Q-ary Tree Fundamental
  Computation of Naive approach and Unified approach
  Experimental Evaluation
  Results
  Joined Q-ary Tree
  EPC Bits Prediction and Classification
  Unique Bits Computation
  Tags Splitting
  Experimental Evaluation
  Results
  Overall Analysis
  Summary

5 Probabilistic Anti-Collision Approaches
  Mathematic Fundamental for ALOHA-based Tag Estimation
  Precise Tag Estimation Scheme
  Slot Observation and Initial Q Value
  Suggested Threshold for Frame-Size
  PTES approaches
  Sample Tag Estimation and Allocation
  Experimental Evaluation
  Results
  Probabilistic Cluster-Based Technique
  5.3.1 Probabilistic Anti-Collision Algorithm using PTES
  PCT Preliminary
  Sample Boundary Computation
  PCT Rules
  Experimental Evaluation
  Results
  Overall Analysis
  Summary

6 Conceptual Selective Technique Management
  Chain Reaction from Data Collection Process
  Comparative Analysis of Deterministic and Probabilistic Techniques
  Data sets
  Comparative Analysis
  Strategies for Choosing Suitable Anti-Collision Techniques
  Novel Decision Tree for Anti-Collision Methods Selection
  Extended Solution for Complex Anti-Collision Methods Selection
  Six Thinking Hats for Complex Anti-Collision Methods Selection
  Applicability of Anti-Collision Techniques in Real World Scenario
  Wine Warehouse Tag-and-Ship Scenario
  Document Warehouse Scenario
  Summary

7 Conclusions
  Summary of Contributions
  Future Works

Bibliography

List of Figures

2.1 RFID Operational Frequencies
An example of: a) Refraction, b) Reflection, and c) Scattering
An example of how RFID tag, reader, middleware and application operate
An example of RFID-enabled Supply Chain System
An example of: a) Simple antenna pattern, and b) Antenna pattern containing protrusions
Proper tag orientation for a linearly polarised antenna
Proper tag orientation for a circularly polarised antenna
Various types of anti-collision methods
An example of: a) Passive Tag, b) Semi-passive/Semi-active Tag, and c) Active Tag
An example of three readers deployment, where R1 and R2 covered S1, and R2 and R3 covered S
Collision Problems in RFID System: a) Reader-Reader Collision, b) Reader-Tag Collision, and c) Tag-Tag Collision
Taxonomy of RFID Readers anti-collision protocols
Taxonomy of RFID Tags anti-collision protocols
Binary Tree Memory based anti-collision protocol
Query Tree Memoryless based anti-collision protocol
The starting point of tag identification in tree-based protocols
Tree-based protocols: a) Query tree protocol, b) 4-ary tree protocol
A sample procedure of Frame-slotted ALOHA
Empty Slot, Successful Slot, and Collision Slot in EPC Class 1 Generation 2 Protocol
4.1 Crystal Warehouse Scenario: a) Unique Item-Level, b) Unique Container-Level, c) Unique Company-Level, and d) Unique Warehouse-level
Splitting Fitness: a) Worst-Case Splitting, b) Perfect Splitting, and c) Random Splitting
A sample of 16 tags from the same pallet with the same Object Class and 16 unique Serial Numbers
A sample of: a) Naive 4-ary Tree, and b) Unified 4-ary & 8-ary Tree
Identification processes of: a) Naive 2-ary Tree, b) Naive 4-ary Tree, c) Unified 2-ary & 4-ary Tree, and d) Unified 4-ary & 2-ary Tree
Level-Packaging: a) a case with 6 glasses, and b) a pallet with 27 cases
Performances of Naive Q-ary Trees on different set of tags
Performances of sixteen combination of Q-ary Trees (4 Naive and 12 Unified) where F = 36 and S =
Results of two Naive approaches (2-ary, 4-ary) and two Unified approaches (2-ary & 4-ary, 4-ary & 2-ary) for number of Idle cycles, Collision cycles, Successful cycles, and Overall cycles
Results of two Naive approaches (2-ary, 4-ary) and two Unified approaches (2-ary & 4-ary, 4-ary & 2-ary) for Number of bits queried for Idle cycles, Collision cycles, Successful cycles, and Overall cycles
Performance Analysis of 2-ary Tree vs. 4-ary Tree on Unique bits of EPC, Bit 61-68, until all tags are identified. Results of Overall cycles are displayed
A sample of: a) a Naive 4-ary Tree, b) a Naive 2-ary Tree, and c) a Joined Q-ary Tree
Joined Q-ary Tree structure for GID-96 bits EPC
Joined Q-ary Tree structure for GID-96 bits EPC with 36 Identical bits Header and GMN, 20 Identical bits OC, 4 Unique bits OC, 30 Identical bits SN, and 6 Unique bits SN
Performances comparison (GID-96) between Naive approaches and Joined Q-ary approach
Percentage of improvement (GID-96) of Joined Q-ary Tree compared with Naive 2-ary Tree and Naive 4-ary Tree
Accumulative Bits Length (GID-96) of three approaches: a) Naive 2-ary Tree, b) Naive 4-ary Tree, and c) Joined Q-ary Tree on different tag sets
Performances comparison between Naive approaches and Joined Q-ary approach using different Encoding Scheme: a) GID-96 bits, b) SGTIN-96 bits, and c) GIAI 96 bits
4.19 Percentage of improvement of Joined Q-ary Tree compared with Naive 2-ary Tree and Naive 4-ary Tree, using different Encoding Scheme: a) GID-96 bits, b) SGTIN-96 bits, and c) GIAI 96 bits
Variable V1 and V2 for Collision slot and Empty slot calculation for PTES[CE] method. There are ninety-nine possible combinations of V1 and V2, in order to find optimal parameters for c and e prediction
A sample first round of tag allocation with Initial Q of 4. Collision slot c = 7, Empty slot e = 4, and Successful slot s =
A sample of Q-adjust in each round of identification until all tags are identified
A sample second round of tag allocation with Initial Q of 4, V1 = 2.0, and V2 = 0.5. Collision slot c = 2, Empty slot e = 5, and Successful slot s =
A sample third round of tag allocation with Initial Q of 3, V1 = 2.0, and V2 = 0.5. Collision slot c = 1, Empty slot e = 3, and Successful slot s =
A sample fourth (final) round of tag allocation with Initial Q of 2, V1 = 2.0, and V2 = 0.5. Collision slot c = 0, Empty slot e = 2, and Successful slot s =
Performance efficiency of PTES[C], Sch, and LB methods, using different Initial Q: a) PTES[C] 200 tags and b) PTES[C] 300 tags
Performance efficiency of PTES[CE], PTES[CCE], Sch, and LB methods, using different Initial Q: a) PTES[CE] 200 tags, b) PTES[CE] 300 tags, c) PTES[CCE] 200 tags and d) PTES[CCE] 300 tags
Performance efficiency (a) and Number of frames (b) of PTES[C] (V1 = 2.3 to 2.5) versus Sch methods, using Initial Q of 8 on different tag sets
Results of PTES[CE] and PTES[CCE] (V1 = 2.3, V2 = 0.1) versus Sch methods using Initial Q of 8 on different tag sets: Performance efficiency (a: PTES[CE], c: PTES[CCE]) and Number of frames (b: PTES[CE], d: PTES[CCE])
Performance efficiency of different frame-size on different number of tags
The minimum and maximum boundaries and their correlated percentage of efficiency for frame-size of
Number of slots comparison (a) and Performance efficiency (b) for DFSA, EDFSA, PCT128, PCT256, and PCT-E methods on different number of tags
Percentage of improvement of PCT compared with DFSA and EDFSA methods
6.1 Comparative analysis of deterministic versus probabilistic anti-collision methods: a) Number of slots comparison and b) Performance efficiency
Novel Decision Tree for Anti-Collision Methods Selection
Novel Decision Tree for Local Pen Maker Company (SME) Anti-Collision Methods Selection
Novel Decision Tree for Local Notebook Manufacturer (SME) Anti-Collision Methods Selection
Novel Decision Tree for International Stationery Enterprise Anti-Collision Methods Selection
Novel Decision Tree for International A-Grade Filing and Storage Group Anti-Collision Methods Selection
Six Thinking Hats Framework
Six Thinking Hats: Global Trading Enterprise (GTE) Scenario
Wine Warehouse Tag-and-Ship Scenario
Document Warehouse Scenario

List of Tables

2.1 The Uniform Resource Identifier (URI) encoding complements the EPC Tag Encodings defined for use within RFID tags and other low-level architectural components
A sample of noise where * indicates a noise reading. Since the noise threshold equals to 3 and the tag catch is only for GID encoding, any tag that appears less than three times within a specific time frame or does not satisfy tag catch requirement, is classified as noise
A sample of data duplication, where TagE is captured twice and TagF is captured three times
A sample of Missed reads where at time 500msec, 800msec and 1000msec, readings of TagA are missing
Identification process of Query Tree versus Hybrid Query Tree
EDFSA Rule - The number of unread tags, optimal frame-size, and number of group
The GID-96 includes three fields in addition to the Header, with a total of 96-bits binary value. Only H is shown in Binary, while the rest are shown in Decimal
The SGTIN-96 includes six fields with a total of 96-bits binary value. Only H is shown in Binary, while the rest are shown in Decimal
SGTIN-96 and GIAI-96 Partitions in bits
The GIAI-96 includes five fields with a total of 96-bits binary value. Only H is shown in Binary, while the rest are shown in Decimal
The Unified Q-ary Tree can be merged into twelve different combinations. 1, 2, 3, and 4 represent the Number of bits queries each time for splitting tags when collision occurred
4.6 Calculation of Total memory bits required for two Naive and two Unified Q-ary Trees. TNBL shows the Total Number of Bits required for the specific Q-ary Tree
Sample Outcomes for 5 tags identification using Naive and Unified approaches
This Table shows Total Memory Bits required for each Q-ary Tree for 192 tags set identification
Performance Analysis of 2-ary Tree vs. 4-ary Tree on Unique bits of EPC, Bit 61-68, until all tags are identified
Formal structure of bits classification of EPC GID-96 bits. *UOC is number of Unique bits within Object Class and **USN is number of Unique bits within Serial Number
Sample bits classification of EPC GID-96 bits, where Object Class = 12 and Serial Number = 60 (Total of 720 tags)
Sample 36 bits tags with 24 Identical bits and 8 Unique bits
Identification process of 2-ary Tree and 4-ary Tree on Identical bits and Unique bits of EPC
Calculation of Total Bits Length required for two Naive Q-ary Trees and a Joined Q-ary Tree. TNBL shows the Bits Length required for the specific Naive/Joined Q-ary Tree
Performance Analysis of Naive 2-ary Tree, Naive 4-ary Tree, and Joined Q-ary Tree on set of 10 sample tags
Chosen EPC Pattern for Experiment One
Identical and Unique Bits classification of EPC GID-96 bits for Experiment one - Test case A, B, and C. I = Identical bits, U = Unique bits
Actual Separating Point for Experiment one - Test case A, B, and C. At a specific SP, Joined Q-ary Tree will adjust its branch to either 2-ary or 4-ary Tree
Chosen EPC Pattern for Experiment Two
Percentage improvement of the proposed Joined Q-ary Tree versus existing Naive 2-ary (N2) and Naive 4-ary (N4) approaches
Accumulative Bits Length of three approaches: Naive 2-ary Tree, Naive 4-ary Tree, and Joined Q-ary Tree
Number of bits length of three approaches using different Encoding Scheme: a) GID-96 bits, b) SGTIN-96 bits, and c) GIAI 96 bits
4.23 Percentage of improvement of Joined Q-ary Tree versus two Naive approaches using different Encoding Scheme: a) GID-96 bits, b) SGTIN-96 bits, and c) GIAI 96 bits
Suggested frame-size boundary (B) and minimum and maximum number of tags (NT) for specific estimated number of tags
PTES methods comparison
Sample tag estimation and frame-size (Q) adjustment after the first round of identification, using PTES[C] method
Sample tag estimation and frame-size (Q) adjustment after the second round of identification, using PTES[C] method
Sample tag estimation and frame-size (Q) adjustment after the third round of identification, using PTES[C] method
Sample tag estimation and frame-size (Q) adjustment after the fourth round of identification, using PTES[C] method
Sample tag estimation and frame-size (Q) adjustment after the fifth round of identification, using PTES[C] method
Chosen Parameters for Experiment One
Chosen Parameters for Experiment Two
Performance efficiency of PTES[C], Sch, and LB methods, using Initial of 8 on different sets of tags
Performance efficiency of PTES[CE], PTES[CCE], Sch, and LB methods, using Initial of 8 on different sets of tags
Available Information and Missing fields on System Efficiency. MinB = Minimum point of occurrence, MaxB = Maximum point of occurrence
Derived Equations for Missing fields on System Efficiency. MinB = Minimum point of occurrence, MaxB = Maximum point of occurrence
The conversion of PCT rules to β Beta, κ Kappa, and µ Mu
PCT256 Boundary Computation - number of group (Frame-Size 256 and 128), and minimum and maximum boundaries
PCT256 Rule - The number of unread tags, optimal frame-size (A and B), and number of group (A and B)
PCT128 Boundary Computation - number of group (Frame-Size 128 and 64), and minimum and maximum boundaries
PCT128 Rule - The number of unread tags, optimal frame-size (A and B), and number of group (A and B)
5.19 PCT-E Boundary Computation - number of group (Frame-Size 256, 128 and 64), and minimum and maximum boundaries
PCT-E Rule - The number of unread tags, optimal frame-size (A, B, C), and number of group (A, B, C)
Chosen Parameters for Experiment Three
Number of slots comparison and Performance efficiency for DFSA, EDFSA, PCT128, PCT256, and PCT-E methods on different number of tags
Percentage improvement of the proposed PCT128, PCT256, and PCT-E versus existing EDFSA (ED) and DFSA (D) techniques
Chosen EPC Pattern of Tree-based anti-collision methods for Comparative Analysis
Chosen Parameters of ALOHA-based anti-collision methods for Comparative Analysis
Number of slots and performance analysis for Joined Q-ary Tree (100 tags), Joined Q-ary Tree (50 tags), PCT256 no group, and PCT256 on different number of tags
Preferred Anti-Collision Method for Each Location (Zone 1-4) in GTE scenario
Selected Anti-Collision Method using Decision Tree and Six Thinking Hats Strategies. Joined Q-ary Tree = JQT; PCT Group = PCT-G; PCT no Group = PCT-NG

Abbreviations and Symbols

List of Abbreviations

ABS      Adaptive Binary Splitting
AQS      Adaptive Query Splitting
BA       Bit Arbitration
BBT      Bit-by-Bit Binary Tree
BFSA     Basic Framed-Slotted ALOHA
BS       Binary Splitting
BTS      Binary Tree Splitting
CDMA     Code Division Multiple Access
CEP      Complex Event Processing
CP       Company Prefix
DBSA     Dynamic Binary Search Algorithm
DFSA     Dynamic Framed-Slotted ALOHA
DoD      DoD Identifier
DSPI     Device Service Provider Interface
EAS      Electronic Article Surveillance
EBBT     Extended Bit-by-Bit Binary Tree
EDFSA    Enhanced Dynamic Framed-Slotted ALOHA
EM       Electromagnetic
EPC      Electronic Product Code
ESP      Extensible Sensor Stream Processing
FDMA     Frequency Division Multiple Access
FV       Filter Value
GIAI     Global Individual Asset Identifier
GID      General Identifier Number
GMN      General Manager Number
GRAI     Global Returnable Asset Identifier
GTE      Global Trading Enterprise
H        Header
HF       High Frequency
HQT      Hybrid Query Tree
IAR      Individual Asset Reference
ID-BTS   ID Binary Tree Stack
ImpQT    Improved QT
IntQT    Intelligent Query Tree
IR       Item Reference
IRE      Institute of Radio Engineering
ISO      International Organisation of Standards
IT       Information Technology
JQT      Joined Query Tree
LB       Lower Bound method
LBT      Listen Before Talk
LF       Low Frequency
MBBT     Modified Bit-by-Bit Binary Tree
MF       Microwave Frequency
NBL      Number of Bits per Level
NBQ      Number of Bits per Query
NCN      Number of Child Node
OC       Object Class
PCT      Probabilistic Cluster-Based Technique
POS      Point of Sale
PT       Partition
PTES     Precise Tag Estimation Scheme
QT       Query Tree
QTR      QT-based Reservation
RF       Radio Frequency
RFID     Radio Frequency Identification
RRE      Redundant Reader Elimination
RTLS     Real-Time Location Systems
Sch      Schoute method
SDMA     Space Division Multiple Access
SGLN     Serialised Global Location Number
SGTIN    Serialised Global Trade Item Number
SME      Small and Medium Enterprise
SN       Serial Number
SP       Separating Point
SSCC     Serialised Shipping Container Code
TDMA     Time Division Multiple Access
TNBL     Total Bits Length/Total Number of Bits
TS       Tree Splitting
UHF      Ultra High Frequency
UOC      Unique bits within Object Class
URI      Uniform Resource Identifier
USN      Unique bits within Serial Number

List of Symbols

⌈ ⌉      Ceiling (Round Up)
β        Beta
κ        Kappa
max      Maximum
min      Minimum
Q        Frame-Size
µ        Mu
V        Variable

1 Introduction

Radio Frequency Identification (RFID) technology uses radio frequency waves to automatically identify people or objects. An RFID system mainly consists of fast-capturing radio frequency tags and networked electromagnetic readers. RFID is emerging as an important technology for advancing a wide range of applications, with the potential to improve the efficiency of business processes by providing automatic identification and data capture.

The technology that forms the basis of RFID was first developed during World War II, when it was used to distinguish friendly aircraft from enemy aircraft, an application known as Friend-or-Foe (Landt, 2001). Interest in RFID technology has since grown rapidly, and professionals who work with it can now validate their knowledge and skills through the CompTIA RFID+ certification (CompTIA-RFID+, 2008). In the modern world, RFID technology is used in applications such as distribution and retail packaging, security, library systems, defence and military, health care, and baggage and passenger tracing at airports.

An RFID system mainly comprises the following components: a Tag, which has a microchip attached to an antenna that transmits and responds to radio signals of a particular frequency; a Reader, which sends and receives RFID data to and from tags via antennas; Middleware, which preprocesses the RFID data and converts it into meaningful data; and Application software, which is a specific component that resides on the host computer.

An RFID reader retrieves information from tags and sends that information back to the host computer via middleware. RFID data streams, which are captured by readers, accumulate very quickly and carry little information because they are raw. These data are inaccurate and need to be filtered in order to improve database management.

In the past, RFID systems used proprietary technologies, and no worldwide open standards existed (Brown et al., 2007). There was little inter-connectivity between the products of different RFID vendors. Every vendor had its own readers, tags, antennas, and equipment, but none of the equipment could work together. This lack of inter-connectivity made it challenging for companies to deploy RFID technology in a large global supply chain. However, since 2006, many international and industrial organisations have created open standards, which have allowed this problem to diminish quickly.

There are several methods of identification, but the most common is to store a serial number that uniquely identifies a person or object, such as an Electronic Product Code (EPC). Every EPC is a string of binary digits that provides a unique identity for a physical object.

All data captured by RFID readers before any further processing are known as dirty data. To improve the efficiency of the database, dirty data must be filtered at an early stage, soon after they are captured. This filtering of RFID data streams is known as filtering at the edge, where data are still meaningless and easier to eliminate.

The main issue that usually arises in RFID data streams is data stream errors. There are four typical errors: unreliable reads, noises, missed reads, and duplications/redundancies. Unreliable reads occur when a deployment of RFID tags and readers is subject to environmental interference, such as metal or water nearby (Fishkin et al., 2004).
They also occur due to the orientation or rotation of tags and readers, the distance between tags and readers, the number of tags and readers in the interrogation zone, and the number of objects moving simultaneously. Noises occur when additional unexpected readings are generated, for example when RFID tags outside the normal reading scope are accidentally captured by the reader (Bai et al., 2006). Duplications or redundancies can happen at two different levels: redundancy at reader level and redundancy at data level (Derakhshan et al., 2007). Redundancy at reader level occurs when more than one reader is deployed to cover a specific location, whilst redundancy at data level occurs when the same data are captured more than once by a reader.

Several techniques for filtering RFID data have been proposed in the literature. However, each of these techniques filters only specific kinds of errors, so some erroneous data are still recorded in the database. The most common errors are missed reads, which usually happen when low-cost, low-power hardware leads to frequently dropped reads (Derakhshan et al., 2007). Another cause of missed reads is simultaneous transmission in RFID systems, which leads to collisions because the readers and tags typically operate on the same channel. Tag collisions in RFID systems happen when multiple tags reflect their respective signals back to the reader at the same time, preventing the reader from identifying all tags. Filling in dropped reads is one way to compensate for missed reads, but it is preferable to prevent the data from going missing in the first place.

The RFID collision problem can be addressed by anti-collision techniques, which prevent two or more tags from responding to a reader at the same time and re-identify tags after a collision has occurred. Current deterministic anti-collision methods suffer from identification delay and high memory usage during the identification process, while probabilistic anti-collision methods suffer from low performance efficiency and from tag starvation caused by inaccurate frame-size estimation.

In this research, the Unified Q-ary Tree and the Joined Q-ary Tree anti-collision schemes are proposed, based on the deterministic Q-ary Tree. The motivation of this work is to improve the quality of the data obtained and to minimise the memory required per complete identification. Methodologies for the Unified and Joined Q-ary Trees are first derived, and experimental evaluations are then conducted in order to demonstrate the efficiency of the proposed techniques. The results and analysis of the experiments indicate that both the Unified and the Joined Q-ary Tree can effectively reduce the total memory required compared with current state-of-the-art techniques, which in turn results in minimal identification delay.

Additionally, we propose a new Precise Tag Estimation Scheme (PTES) for Backlog estimation and frame-size prediction that is compatible with any probabilistic anti-collision technique. The motivation of this work is to achieve a more accurate estimate of the number of tags within an interrogation zone, which leads to more accurate frame-size prediction and higher system efficiency. The methodology for PTES is first derived, and experiments are then conducted in order to demonstrate the efficiency of the proposed technique.
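
The interplay between Backlog estimation and frame-size prediction described above can be illustrated with a small simulation. The following is a minimal sketch of dynamic framed-slotted ALOHA using two classical estimators from the literature, the Lower Bound method (backlog = 2 × collision slots) and the Schoute method (backlog = 2.39 × collision slots). It is not an implementation of PTES, whose parameters are derived in Chapter 5. The function names, the rounding of the next frame-size up to a power of two, and the fixed random seed are assumptions made for illustration only.

```python
import math
import random


def run_frame(num_tags, frame_size, rng):
    """Simulate one frame: every unread tag picks one slot uniformly at random."""
    slots = [0] * frame_size
    for _ in range(num_tags):
        slots[rng.randrange(frame_size)] += 1
    empty = slots.count(0)
    success = slots.count(1)
    collision = frame_size - empty - success
    return empty, success, collision


def estimate_backlog(collision_slots, factor):
    """Estimate the number of unread tags from the number of collision slots.

    factor = 2.0 gives the Lower Bound estimate, factor = 2.39 the Schoute
    estimate. A collision slot holds at least two tags, hence the floor of 2.
    """
    if collision_slots == 0:
        return 0
    return max(2, math.ceil(factor * collision_slots))


def next_frame_size(backlog):
    """Assumption: round the next frame-size up to a power of two (2**Q)."""
    return 1 << (backlog - 1).bit_length()


def identify_all(num_tags, initial_q, factor, seed=0):
    """Run frames until every tag is read; return (frames, slots, efficiency)."""
    if num_tags <= 0:
        raise ValueError("num_tags must be positive")
    rng = random.Random(seed)
    remaining = num_tags
    frame_size = 1 << initial_q
    frames = 0
    total_slots = 0
    while remaining > 0:
        _, success, collision = run_frame(remaining, frame_size, rng)
        remaining -= success          # tags alone in a slot are identified
        frames += 1
        total_slots += frame_size
        backlog = estimate_backlog(collision, factor)
        if backlog > 0:
            frame_size = next_frame_size(backlog)
    # System efficiency: identified tags divided by slots used.
    return frames, total_slots, num_tags / total_slots
```

For example, `identify_all(200, 4, 2.39)` returns the number of frames, the total number of slots, and the resulting system efficiency for 200 tags starting from an initial Q of 4; running it with `factor=2.0` instead gives the corresponding figures for the Lower Bound estimate.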
We also introduce a group-selection approach called the Probabilistic Cluster-Based Technique (PCT) to improve identification time and minimise the number of frames and slots used during an identification process. The experimental results indicate that PTES, under various parameters, has an impact on the system efficiency of probabilistic anti-collision. Therefore, to achieve the best performance and solve the tag starvation problem, tags should be grouped into specific sizes according to the PCT threshold, and the parameters for frame-size prediction should be dynamically adjusted over the identification process.

We also assess our two proposed anti-collision techniques and carry out a comparative analysis in order to find the benefits and disadvantages of each method. We then propose two strategies for selective anti-collision technique management, in order to obtain the optimal outcome of anti-collision method selection. From the investigation, we have discovered that each anti-collision method has an advantage over the others in some cases. We found that, by correctly identifying the most suitable anti-collision technique using our proposed Novel Decision Tree Strategy and Six Thinking Hats Strategy, the data collection process can be improved, and the chain reaction toward the next level of data transformation, aggregation, and event processing can be decreased. Thus, it is important that the correct type of anti-collision algorithm is applied to each scenario. In addition, we demonstrate the applicability of our proposed techniques to real-world scenarios.

The remainder of this thesis is organised as follows:

In Chapter 2, some background information is provided on RFID, including a brief history of RFID, the fundamentals and technology overview, system components, and RFID data management issues. The focus is particularly on the use and issues of RFID technology in the supply chain, because there has been a great deal of interest in the topic over the past few years. The main components of an RFID system, including tags, readers, and antennas, are also described in depth.

In Chapter 3, four typical errors in RFID data streams and their causes are identified. The literature on current methods and their drawbacks is also discussed. This includes error-handling techniques for unreliable reads, noises, missed reads, and duplications. In particular, we focus on specific anti-collision methods for missed reads caused by collisions. The shortcomings of existing methods are identified, and the research question is also posed in this chapter.

In Chapter 4, major problems of existing deterministic anti-collision schemes are investigated. Two proposed novel tree-based anti-collision methods, the Unified Q-ary Tree and the Joined Q-ary Tree, are described and analysed. In experimental evaluation, these approaches overcome the limitations of previously proposed approaches, significantly conserve memory, and improve the identification time. We also identify and confirm certain properties of importance for deterministic anti-collision methods in general.

In Chapter 5, the main issues of existing probabilistic anti-collision schemes are addressed.
We discuss our proposed frame-size estimation method, PTES, and identify strategies to overcome the limitations of inaccurate Backlog estimation techniques. Furthermore, we describe and analyse the PCT approach, which improves identification time and minimises the number of frames and slots used during an identification process. Certain properties of importance for probabilistic anti-collision methods in general are also clarified in this chapter.

In Chapter 6, a comparative analysis is conducted between the deterministic Joined Q-ary Tree and the probabilistic PCT. We discover that each method has advantages and disadvantages relative to the other, depending on the specific case. We then introduce two strategies, the Novel Decision Tree and the Six Thinking Hats, in order to find the optimal method for specific scenarios. Additionally, we form a new concept of the applicability of each type of anti-collision approach, which we then apply to a sample real-world scenario.

Finally, in Chapter 7, we conclude the thesis with a summary of the main findings of our study and a discussion of future research plans.

2 RFID Background

In this chapter, we present an overview of some background information, including the history of Radio Frequency Identification, the fundamentals and technology overview, system components, and RFID data management issues.

2.1 History of RFID

Although many people believe that RFID is a new technology, it has an extensive history. RFID is more precisely described as an emerging technology, and its emergence is best understood by reviewing its history. RFID systems are complex technologies that can be utilised in many ways. Some RFID technologies have existed for a long time and have become more pervasive in the supply chain. Other RFID technologies have been utilised in other industries, such as animal tracking, where they have unique advantages (Jones and Chung, 2008). Although the history of RFID can be traced to the 1930s, the technology underlying RFID has its roots in 1897, when Guglielmo Marconi invented the radio (Bhuptani and Moradpour, 2005). In order to better define the development of RFID technology, time-line summaries are shown below (Hunt et al., 2007).

Pre 1940s: Beginning in 1896, Marconi, Alexanderson, Baird, Watson, and many others tried to apply the laws of electromagnetic energy to radio communications and radar. The work undertaken during this period became the core of RFID technology.

1940s: After World War II, scientists and engineers continued the research in radio frequency communications and radar. In October 1948, Harry Stockman published a paper in the Proceedings of the Institute of Radio Engineering (IRE) called "Communications by Means of Reflected Power".

1950s: During the 1950s, many of the technologies related to RFID were explored by researchers. For example, Harris Patent published a paper called "Radio Transmission Systems with Modulatable Passive Responders". The United States military also began to implement an early form of RFID technology for aircraft, called Friend-or-Foe.

1960s: Some commercial activities began in the late 1960s, with companies such as Sensormatic and Checkpoint developing Electronic Article Surveillance (EAS) equipment for anti-theft and security applications. EAS later became the first widespread commercial use of RFID.

1970s: Government laboratories, academic institutions, and companies became increasingly involved in RFID. In 1975, a paper titled "Short-Range Radio-telemetry for Electronic Identification Using Modulated Backscatter", written by Alfred Koelle, Steven Depp, and Robert Freyman, was released by Los Alamos Scientific Laboratory. By 1978, a passive microwave tag had been developed, and in 1979 these tags were used for animal tagging.

1980s: The first widespread commercial use of RFID systems began in the 1980s. The systems were simple, such as livestock management, keyless entry, and personnel access systems. In 1987, the world's first Motor Vehicle Toll Collection was implemented in Norway; and also in Dallas in However, all of the RFID systems implemented in the 1980s were proprietary systems, which kept costs high and slowed down industry growth.

1990s: By 1994, RFID toll systems could operate at highway speeds, which meant that drivers could pass through toll points without the need to slow down.
During this time, several standards organisations also worked on publishing guidelines, including the International Organisation of Standards (ISO). In 1999, the Auto-ID Center at Massachusetts Institute of Technology (MIT) was also established for standardisation purposes.

2000s: By early 2000, 5-cent tags had become a realistic prospect, raising the possibility that RFID technology could someday replace barcode systems. In 2003, both the world's largest retailer (Wal-Mart) and the world's largest supply chain (the DoD) issued RFID mandates requiring suppliers to begin employing RFID technology by Furthermore, the Auto-ID Center was merged into EPCGlobal in 2006, and all standards were converted to one that serves to increase competition among players in the industry, lower the costs of RFID, and quicken the deployment of RFID technology.

2010s: Many industries continue to adopt RFID technology and integrate it into their organisations. Challenges and opportunities that have arisen in RFID systems are constantly being addressed.

2.2 RFID Fundamentals

Radio frequency communication uses electromagnetic (EM) waves of different frequencies. Radio frequencies occupy a small portion of the larger EM spectrum, and radio signals are affected in many ways. The total EM spectrum also includes higher-frequency waves such as light, ultraviolet, X-ray, gamma-ray, and cosmic-ray. This section explains the elements of RF communication, the radio spectrum, and the interference and Multipath of RF waves.

Elements of Radio Frequency Communication

Radio frequency (RF) communication uses electromagnetic waves with frequencies from a specific part of the EM frequency spectrum (Sanghera et al., 2007). Therefore, the underlying physics behind RF communication is the same as for any communication that uses electromagnetic waves to carry information. The following are the four major elements that make this communication happen:

Data signal: This is the wave that actually contains the information that needs to be sent to the receiver.
Carrier signal: This is the wave that carries the data signal.
Modulation: This is the process that encodes the data signal into the carrier signal and creates the radio wave that is actually transmitted by the antenna.
Antenna: This is a device used to transmit and receive signals.

Radio Spectrum

The radio spectrum covers an enormous range of frequencies. To categorise and manage the different areas of the spectrum, it is split into many different segments, but RFID technology uses only four of these segments, as shown in Figure 2.1. Figure 2.1 shows multiple frequencies in relation to the entire radio spectrum.
Only frequencies between kHz, 13.56MHz, MHz, and 2.45GHz & 5.8GHz from Low Frequency (LF), High Frequency (HF), Ultra High Frequency (UHF) and Microwave Frequency (MF) are used respectively.

Figure 2.1: RFID Operational Frequencies

Interference and Multipath of RF waves

Radio signals are affected in many ways by objects in their path and by the media through which they travel (Brown et al., 2007). Interference is the interaction of two or more radio waves resulting in a new wave pattern. When waves generated due to propagation effects take a different path than the original wave, it is called Multipath. As radio waves travel, they interact with objects and the media they encounter. This interaction causes absorption, diffraction, refraction, reflection, scattering, and free space loss of the wave (Brown et al., 2007; Sanghera et al., 2007).

Absorption: When an RF wave hits a material object, some of its energy will be absorbed by that object, depending on the frequency of the wave and the material of the object. Water, objects containing water such as liquid products, and metal objects are likely to absorb RF waves. UHF waves, due to their shorter wavelengths, are more susceptible to absorption than LF and HF waves.

Diffraction: Diffraction refers to the bending of an EM wave when it comes into contact with sharp edges or when it passes through narrow gaps.

Refraction: Refraction is the change in direction of a wavefront when it hits the interface between two different media and does not return to the original medium. Instead, it passes from one medium into the other. Figure 2.2a) illustrates refraction.

Reflection: Reflection is the change in direction of a wavefront when it hits the interface between two different media and returns into the medium from which it came. When a radio signal is reflected, some loss of signal normally occurs, either through absorption or as a result of part of the signal passing into the other medium. Figure 2.2b) illustrates reflection.

Scattering: Scattering occurs when the medium through which the wave travels contains objects whose dimensions are small compared to the wavelength. When an RF wave is scattered by the rough surfaces of small objects, the result is loss of the signal or dispersion of the wave, as shown in Figure 2.2c).

Figure 2.2: An example of: a) Refraction, b) Reflection, and c) Scattering

Free Space Loss: If the space through which the RF wave travels is free of all obstructing material, there will be no absorption, refraction, reflection, diffraction or scattering effects. However, there will still be some loss in signal strength, called free space loss.

2.3 RFID Technology Overview

RFID technology is an automated wireless technology that uses the electromagnetic spectrum to uniquely identify people or objects. There are several methods of identification, but the most common is to store a serial number that identifies a person or object, such as an Electronic Product Code (EPC). At a minimum, RFID consists of a tag and a reader, but a complete RFID system involves many other technologies, for example, computers, networks, the Internet, and software such as middleware and user applications.

The term data streams in this thesis refers to the raw data that is communicated and exchanged between RFID readers and tags. The raw data carries no meaningful information on its own and needs to be further processed, extracted, integrated, or transformed before being stored in the database. RFID data streams also have common characteristics, which are fundamental for RFID data management. This section explains the RFID system mechanism, the characteristics of RFID data, the main commercial applications of RFID, and RFID technology in the supply chain.

RFID System Mechanism

A typical RFID system is divided into two layers: the physical layer or device layer, and the Information Technology (IT) layer or application layer (Brown et al., 2007; Bornhovd et al., 2005). The physical layer consists of: one or more reader antennas; one or more readers (Interrogator); one or more tags (Transponder); and a deployment environment. The IT layer consists of: one or more host computers connected to readers (directly or through a network); and appropriate software such as device drivers, filters, middleware, databases, and user applications. In some cases, however, the middleware is classified as a separate layer of its own, responsible for data integration and aggregation.

Figure 2.3: An example of how RFID tag, reader, middleware and application operate

Figure 2.3 shows how an RFID reader retrieves information from a tag and sends that information back to the host computer via middleware. The middleware first needs to convert the raw data retrieved by the reader into meaningful data, before sending it to the application layer.
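As a rough illustration of the kind of conversion the middleware performs, the sketch below collapses repeated raw reads of the same tag into single events. It is a minimal sketch under stated assumptions, not the thesis's middleware: the `RawRead` record, the `filter_duplicates` function, and the one-second window are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawRead:
    """One raw observation as a reader might report it (hypothetical layout)."""
    reader_id: str
    epc: str
    timestamp: float  # seconds


def filter_duplicates(reads, window=1.0):
    """Collapse repeated reads of the same tag by the same reader.

    A read is emitted only if the same (reader, EPC) pair has not been
    seen during the previous `window` seconds. The window value is an
    assumption for illustration, not a figure from the thesis.
    """
    last_seen = {}
    events = []
    for read in sorted(reads, key=lambda r: r.timestamp):
        key = (read.reader_id, read.epc)
        previous = last_seen.get(key)
        if previous is None or read.timestamp - previous >= window:
            events.append(read)
        # Remember every read, so a tag that stays in the zone keeps
        # extending its own quiet period.
        last_seen[key] = read.timestamp
    return events


if __name__ == "__main__":
    raw = [
        RawRead("r1", "EPC-A", 0.0),
        RawRead("r1", "EPC-A", 0.2),
        RawRead("r1", "EPC-B", 0.1),
        RawRead("r1", "EPC-A", 0.4),
        RawRead("r1", "EPC-A", 2.0),
    ]
    for event in filter_duplicates(raw):
        print(event)
```

A real middleware layer would also have to deal with missed reads and noise, which this sketch deliberately leaves out.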

Characteristic of RFID data

RFID data share common characteristics, which are fundamental for RFID data management. These characteristics are as follows:

Streaming and raw data: RFID data does not carry much information because it is raw. In order to transform this raw data into meaningful data, several levels of inference must be performed.

Large in volume: RFID data are generated automatically and accumulate very fast. Some of this data must be filtered, and a scalable storage scheme is required to ensure efficient queries and updates.

Temporal and dynamic: RFID applications dynamically generate observations, and the data carry state changes (Wang and Liu, 2005; Wang et al., 2006, 2010; Liu et al., 2006). Thus, it is crucial to model such information in an expressive data model that is suitable for application-level uses, including tracking and monitoring.

Implicit and inaccurate: When an observation occurs in an RFID system, the reader observes an EPC, the EPC value, and the timestamp. These data carry implicit information, such as changes of states and locations. It is also inaccurate since the real world deployment is often in percent range, which means that 30 percent of data are discarded (Derakhshan et al., 2007; Jeffery et al., 2005, 2006a,b). Thus, raw observation data need to be transformed into business logic data. At this stage, erroneous readings such as unreliable reads, duplicate reads, missed reads, and noise should be handled.

Main RFID Commercial Applications

Many modern applications integrate RFID technology in order to improve their business processes. Some of the major benefits that an RFID system provides are security and authentication, safety, convenience, and process efficiency (Bhuptani and Moradpour, 2005).
The following are application areas in which RFID technology is currently used (Polniak, 2007; Finkenzeller, 2003; Ahson and Ilyas, 2008; Chawathe et al., 2004; Collins, 2006; Harrop, 2005; Swedberg, 2005; Ferguson, 2006; Chiesa et al., 2002):

Transportation and Distribution: Fixed Asset, Tracking Aircraft, Vehicles, Rail Cars, Containers Equipment, Real-Time Location Systems, and Healthcare Management.
Retail and Consumer Packaging: Supply Chain Management, Carton Tracking, Crate/Pallet Tracking, Item Tracking, and Smart shelves.

Security and Access Control: Child Tracking, Animal Tracking, Airport and Bus Baggage, Anti-Counterfeiting, Computer Access, Employee Identification, Forgery Prevention, Branded Replication, Parking Lot, Access Room, Laboratory and Facility Access, Toll collection, Library System.
Monitoring and Sensing: Pressure, Temperature, Volume, Weight Special, Facility Access, Facility Security Access, and Location within Facility Monitoring.
Point of Sale (POS): Automated Payments, Customer Recognition, Smart Card, and RFID Security.

Figure 2.4: An example of RFID-enabled Supply Chain System

RFID Technology in Supply Chain

Over the past few years, there has been a great deal of interest in RFID technology, mainly within the retail industry. Most of the leading suppliers claim to provide some level of RFID systems (Derakhshan et al., 2007). Indeed, RFID will help companies leverage real-time information about their stock levels and help them improve their replenishment process (Myerson, 2007). For example, the following are some of the benefits of using RFID technology in supply chain management and retail:

Inventory shrinkage: Retailers' replenishment decisions are based on the inventory stock level recorded in the supply chain, which is assumed to be accurate. However, the count in the inventory system sometimes does not reflect the correct number of items in the actual inventory, due to shrinkage or loss of stock. Moreover, handling this type of problem is a very costly operation that requires a regular manual stock take (Lee et al., 2005a). With RFID technology, the cost of regular stock takes can be reduced, because an accurate record of the stock level is maintained in the data warehouse.

Inventory replenishment: Inventory management is a critical task for retail businesses. Items on the store's shelves are often out of stock while plenty of stock is available in the business's storage. This is because there is no automatic process for detecting items that are out of stock and restocking the shelf once it becomes empty. With RFID technology, however, shelf inventory can be tracked automatically (Lee et al., 2005a).

Visibility of inventory across the supply chain: RFID technology provides visibility of inventory throughout the entire supply chain. Figure 2.4 shows an example of an RFID-enabled supply chain system. Each product is tagged so that items moving around a business can be monitored from the supplier's warehouse to the store's shelves, until the item is checked out at the register. These items tend to move and stay together through different locations, especially in the earlier stages of distribution, as seen in Figure 2.4 (Gonzalez et al., 2006b,c,a, 2007). The advantage of bulk item movement is that all items from the same company, pallet, and case will have the same pattern of encoding (more explanation in Section ) and will be easier to monitor and manage as data streams.

2.4 RFID Core Components

RFID involves detecting and identifying a tagged object through the data it transmits. This requires a tag (transponder), a reader (interrogator), and antennae (coupling devices) located at each end of the system. The reader is usually connected to a host computer or other device, which further processes the captured data. One key element of RFID operation is data transfer. This occurs through the connection between a tag and a reader, also known as coupling, via the antenna on either end (Bhuptani and Moradpour, 2005; Karmakar, 2010).
In this section, we describe the three common hardware components present in all RFID systems: antenna, reader, and tag.

RFID Antenna

An antenna is a conductive structure that radiates an EM wave when an electrical current is applied to it; its electronic component is designed to transmit or receive radio waves. It converts electrical energy into a radiating field that extends infinitely outward (Brown et al., 2007; Sweeney, 2005; Karmakar et al., 2008). All RFID systems include two different types of antennas: the reader antenna and the tag antenna. The antenna performance characteristic is one of the most critical elements of any RFID installation, because the antenna transfers power and data from the reader to the passive RFID tags (more explanation in Section ) and receives the tags' reply.


2.4.1.1 Antenna Footprint (pattern)

The footprints of the reader's antennas determine the interrogation zone (reader zone) of a reader. In general, an antenna footprint, also called an antenna pattern, is a three-dimensional region shaped like a balloon projecting out from the front of the antenna. In the real world, the balloon is distorted by interference patterns of radio waves reflected from surrounding objects. Within the reader zone, the antenna's energy is most effective, and a reader can read a tag placed inside this region with the least difficulty. Figure 2.5a) shows an example of a simple antenna pattern.

Figure 2.5: An example of: a) Simple antenna pattern, and b) Antenna pattern containing protrusions

In reality, because of the antenna's characteristics, the footprint of an antenna is never uniformly shaped like a balloon but almost always contains protrusions. Each protrusion is surrounded by dead zones, which are also called nulls (Lahiri, 2005). A tag placed in one of the protruding regions will be read, but not once it shifts into a dead zone. Because of the irregular shape of the antenna footprint, an RFID tag may be readable or not readable based on tiny changes in location or orientation. Therefore, it is important to place the tag within the main interrogation zone without depending on the protruding zones. Some of the read range has to be sacrificed, but a better read rate is obtained as a result. Figure 2.5b) shows an example of an antenna pattern with protrusions.

Polarisation

The readability of the tag greatly depends on the polarisation of the antenna and the angle the tag makes with the reader. For a maximum transfer of power, the reader and the tag antennas should have the same polarisation. For example, if the transmitting antenna is horizontally polarised and the receiving antenna is vertically polarised (or vice versa), not much power can be transferred. If the receiving antenna is circularly polarised, it will receive some radiation regardless of the polarisation of the transmitting antenna. This is because circular polarisation contains both components of linear polarisation: horizontal and vertical.
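The effect of polarisation mismatch described above can be put into numbers with the standard idealised model: between two linearly polarised antennas the coupled power falls off as the squared cosine of the angle between their polarisation axes, and an ideal circularly polarised antenna couples half the power (about 3 dB less) to a linear one at any orientation. The function names below are hypothetical, and the sketch ignores real-world effects such as reflections and imperfect antennas.

```python
import math


def linear_to_linear_fraction(angle_deg):
    """Fraction of power coupled between two ideal linearly polarised
    antennas whose polarisation axes differ by `angle_deg` degrees."""
    return math.cos(math.radians(angle_deg)) ** 2


# Ideal circular-to-linear coupling is independent of tag orientation.
CIRCULAR_TO_LINEAR_FRACTION = 0.5


def loss_db(fraction):
    """Express a coupled-power fraction as a loss in decibels."""
    if fraction <= 0.0:
        return math.inf
    return -10.0 * math.log10(fraction)


if __name__ == "__main__":
    for angle in (0, 30, 45, 60, 90):
        fraction = linear_to_linear_fraction(angle)
        print(f"linear/linear at {angle:2d} deg: fraction={fraction:.3f}")
    print(f"circular/linear loss: {loss_db(CIRCULAR_TO_LINEAR_FRACTION):.2f} dB")
```

Under this model, a horizontally polarised reader antenna and a vertically polarised tag antenna (90 degrees apart) couple essentially no power, which matches the example in the text.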

Linearly polarised antennas: Linear polarisation is relative to the surface of the earth, where horizontally polarised waves travel parallel to the ground and vertically polarised waves travel perpendicular to the ground (Lahiri, 2005). A linearly polarised antenna has a narrower radiation beam with a longer read range compared to a circularly polarised antenna. A narrower radiation beam helps a linearly polarised antenna to read tags within a longer, narrow but well-defined read region, instead of reading tags randomly from its surroundings. A linearly polarised antenna is sensitive to tag orientation with respect to its polarisation direction. These types of antenna are therefore useful in applications where the tag orientation is fixed and predictable. Figure 2.6 shows how a tag should be oriented with respect to a linear antenna for proper reading.

Figure 2.6: Proper tag orientation for a linearly polarised antenna

Circularly polarised antennas: A circularly polarised wave basically spins as it travels. If the wave rotates in a right-hand/left-hand manner, the antenna is considered to be right-hand/left-hand circularly polarised (Brown et al., 2007; Lahiri, 2005). A circularly polarised antenna has a wider radiation beam and hence reads tags in a wider area compared to a linearly polarised antenna. A circularly polarised antenna is largely unaffected by tag orientation. In a mixed environment where orientation cannot be controlled, circular antennas work best. A circularly polarised antenna is preferred for an RFID system that uses high UHF or microwave frequencies in an operating environment where there is a high degree of RF reflectance (due to the presence of metals and/or water). Figure 2.7 shows how a tag should be oriented with respect to a circular antenna for proper reading.

RFID Reader

The reader, also referred to as the interrogator, is a device that captures and processes tag data.
Although some readers can also write data onto a tag, the device is still referred to as a reader or interrogator (Bhuptani and Moradpour, 2005; Finkenzeller, 2003; Karmakar, 2010). The reader is also responsible for interfacing with a host computer.

Figure 2.7: Proper tag orientation for a circularly polarised antenna

Reader Types

Readers come in multiple formats, which can be separated into three main categories (Sanghera et al., 2007):

Fixed readers: Fixed readers are fixed-position interrogators mounted at specific locations through which the tagged items are expected to pass, such as conveyors, dock doors, and retail store checkout points. The advantage of a fixed-mount reader is that the tags are read automatically, and the disadvantage is the possibly harsh environment at the location where the reader is mounted.

Handheld readers: Handheld readers are mobile interrogators. Therefore, they contain all the basic elements in one device, including the antenna and application software. The information collected from the tags is stored in the reader and later transferred to a data processing system, if the application requires it. The advantage of a handheld reader is that a user can bring the reader close to the tagged item and collect the information. The disadvantage is that the read range is less than that of a fixed reader. Handheld readers can be used for applications such as tracking and scanning items in medical, office, and retail environments because they can be easily moved around.

Vehicle-mount readers: Vehicle-mount readers are mobile interrogators that can be mounted on vehicles such as forklifts, paper trucks, cargo trucks, and pallet jacks. The advantage of a vehicle-mount reader is that its read range is larger than that of a handheld reader, and it can cover more area than a fixed-mount reader. The disadvantage is that it might have to work in the vicinity of metallic materials. This could pose a challenge because metals can reflect the RF signal. In addition, a vehicle-mount reader usually has a special shape for easier installation on a vehicle, and a rugged design to survive vibrations and other environmental conditions.

Dense Reader Mode

Dense reader mode, or dense interrogator mode, allows multiple readers located in close proximity to each other to operate without causing reader interference. Dense reader mode coordinates the readers so that no two readers transmit at exactly the same moment on exactly the same frequency, which would cause interference (Brown et al., 2007). To do this, many readers perform frequency hopping and support a function known as Listen Before Talk (LBT). LBT is often used with frequency hopping, a technology that forces a reader to change channels constantly within its frequency band. With LBT, a reader uses an antenna to listen on the frequency on which it is about to transmit. If another reader is communicating in that channel, the reader will automatically switch to the next available channel and transmit there instead. Another way to avoid failed communication between reader and tag is to use an anti-collision protocol, especially when passive tags are used at UHF frequencies.

Authentication and Data Encryption/Decryption

High-security systems also require the reader to authenticate system users. For instance, Point of Sale (POS) systems, in which money is exchanged and transferred, would be prone to fraud if precautions were not taken. There are two types of RFID authentication: mutual symmetrical and derived keys (Hunt et al., 2007). In both of these systems, an RFID tag provides a key code to the reader, which then determines whether the key is correct and whether the tag is authorised to access the system.
Data Encryption/Decryption is another security measure that must be taken to prevent external attacks to the system. In the POS example, if user s key is stolen by a criminal, that information can be used to make fraudulent purchases. The reader must implements data encryption and decryption, in order to protect the integrity of data transmitted wirelessly, and to prevent interception by a third party Anti-Collision Most reader operates within UHF band and must have some sort of tag anti-collision algorithms. This is because in UHF band, tags can be captured faster by reader, which may cause collision, and no tag would be identified. The various types of tag anti-collision 17

44 CHAPTER 2. RFID BACKGROUND methods can be reduced to two basic types, deterministic and probabilistic, as shown in Figure 2.8. Tag anti-collision is needed to prevent two or more tags to response to a reader at the same time (Bhatt and Glover, 2006). These anti-collision techniques will be explained in detailed in Chapter 3. Figure 2.8: Various types of anti-collision methods RFID Tag RFID tags come in many different designs, shapes, and sizes. A tag is designed for a particular application depending on the object or material to which the tag is to be attached. The frequency of operation, functionality, and read range of a tag also varies (Bhuptani and Moradpour, 2005; Bhatt and Glover, 2006; Sweeney, 2005) Tag Types Tags may be classified under different categories, depending on how the tags obtain power, the frequency at which they operate, and the various functionalities implemented on the tags. RFID tags are mainly categorised into Chipped and Chipless tags. This thesis focuses on Chipped RFID tags because Chipless tags do not contain a chip or electronic circuit, and thus store information purely in the electromagnetic materials which comprise the tag (Preradovic et al., 2009; Balbin and Karmakar, 2009). Since the absence of an electronic circuit makes it more difficult to store information in a compact area, chipless RFID tags are generally limited to a data capacity of less than 32 bits, although in some cases more bits are possible. Chipped RFID Tag types are separated into three categories known as Passive Tag, Semi-Passive Tag, and Active Tag. Passive tag Passive tag (Figure 2.9a) does not have its own power source, and it has no battery on-board (Lahiri, 2005). The tag obtains power from radio waves received from the reader. Passive Tags are small and light weight, and their functionalities are 18

45 2.4. RFID CORE COMPONENTS Figure 2.9: An example of: a) Passive Tag, b) Semi-passive/Semi-active Tag, and c) Active Tag limited due to power source. Due to a lack of enough power, it cannot support an active transmitter to communicate with the reader. In addition, passive tags do not contribute to radio noise due to lack of transmitter; and they also have longer life of around 20 years compared to semi-passive and active tag. The read range of passive tags is around few inches to 20 feet. In RFID applications, passive RFID tags are often used. Moreover, passive tags are well suited in applications for which tags are not reusable, because of their low cost. The tags become part of the object to which they are attached and have the same life cycle as the object itself. Semi-passive tag Semi-passive tag (Figure 2.9b) is also called semi-active tag (Brown et al., 2007). This tag has an on-board battery but similar to a passive tag, it does not have an active transmitter. It modulates the reflection of the waves from the reader and requires a reader to send data. Semi-passive tags have a longer read range of more than 100 feet compared to passive tags. Since no transmitter is present, semi-passive tags does not contribute to radio noise but they can have more memory compared to passive tag, and can store more data. The extra functionalities of an on-tag battery creates a few problem such as extra weight, larger size, higher cost, shorter life, and temperature sensitivity. An integrated battery means the tag dies when the battery dies. The battery life lasts around 2 to 7 years. Active tag Active tag (Figure 2.9c) has an on-board power source, usually a battery and an active transmitter (Sanghera et al., 2007). It does not need emitted power or radio signals from the reader to transmit its data. Its typical read range is 300 to 750 feet. The read range depends on the battery power and type of transmitter on the tag. 
An active tag, similar to a semi-passive tag, may have on-board sensors or external sensors connected to it. With more processing power, the tag may collect data from the sensors and locally process the data before broadcasting. Active tags are often used by Real-Time Location Systems (RTLSs).

Tag Frequencies

RFID tags are categorised according to the frequency at which they are designed to operate. Primary frequency ranges are allocated into four categories for use by RFID systems (previously mentioned in Section 2.2.2).

Low Frequency (LF)

The LF range includes frequencies from 30 to 300kHz, but only 125kHz to 134kHz are commonly used. A typical LF RFID system operates at 125kHz or 134.2kHz, and this range is available all over the world. LF tags are passive tags that have no or limited anti-collision capabilities. Therefore, reading multiple tags simultaneously in the interrogator zone is impossible or very difficult. However, LF tags can be easily read while attached to objects containing water, metal, wood, and liquids because they are not sensitive to radio noise. LF tags are used in access control, asset tracking, animal identification, automotive control, healthcare, and various point-of-sale applications. The automotive industry is the largest user of LF tags, where an LF tag is embedded inside the ignition key. When that key is inserted into the key hole and the tag ID is correct, the car can be started.

High Frequency (HF)

The HF range extends from 3 to 30MHz, but the only frequency typically used for HF RFID systems is 13.56MHz. This frequency is now available for RFID applications worldwide. HF tags are passive tags that may have anti-collision capability, which allows multiple tags to be read simultaneously in the interrogator zone. However, since the read range of many HF tags and readers is small, they usually do not implement anti-collision. HF tags are an ideal choice for applications such as smart shelves, credit cards, smart cards, library books, airline baggage, and asset tracking. Because there are no restrictions on the use of the HF frequency, HF tags are currently the most widely used tags around the world.
Ultra High Frequency (UHF)

The UHF range includes frequencies from 300 to 1000MHz, but only two frequency ranges, 433MHz and MHz, are used for UHF RFID systems. The 433MHz frequency is used for active tags, while the MHz range is used for passive tags or semi-passive tags. All the protocols in the UHF range have some type of anti-collision capability, which allows multiple tags to be read simultaneously within the interrogator zone. However, UHF tags cannot be easily read while attached to objects containing water or metal, because these materials absorb UHF waves and detune the tag.

Microwave Frequency (MF)

The MF range includes frequencies from 1 to 10GHz, but only two frequency ranges, around 2.45GHz and 5.8GHz, are used for Microwave RFID systems. Microwave tags are available as passive, semi-passive, and active types.

Japan is the largest user of passive microwave tags. The 2.4GHz frequency range is called the Industrial, Scientific, and Medical (ISM) band and is accepted worldwide.

Tag Identification Method

There are several methods of identification, but the most common is to store a serial number that uniquely identifies a person or object, such as the Electronic Product Code (EPC). The EPC is designed as a universal identifier that provides a unique identity for every physical object globally. Its structure is defined in the EPCglobal Tag Data Standard (EPCGlobal, 2006, 2005, 2008), which is an open standard and is freely available. The EPC Class 1 Generation 2 is widely used in the UHF range for communications at MHz. The passive RFID tag used within the UHF range is sometimes referred to as an EPC Gen-2 tag.

An EPC tag contains a 96-bit unique identifier, a number large enough that it need never be repeated or allocated to anything except that tag. The two primary reasons why EPC numbers contain only a unique identifier, as opposed to actual information about the product, are security and cost (Sweeney, 2005). The most common 96-bit encoding schemes currently in use are: the General Identifier (GID-96), the Serialised Global Trade Item Number (SGTIN-96), the Serialised Shipping Container Code (SSCC-96), the Serialised Global Location Number (SGLN-96), the Global Returnable Asset Identifier (GRAI-96), the Global Individual Asset Identifier (GIAI-96), and the DoD Identifier (DoD-96).

In order to manage and monitor the traffic of RFID data effectively, an EPC pattern is usually used to keep the unique identifiers of the items within a specific range (Darcy et al., 2011). The EPC pattern does not represent a single tag encoding, but rather refers to a set of tag encodings.
For instance, the General Identifier (GID-96) includes three fields in addition to the Header, with a total of 96 bits of binary value. A sample EPC pattern is written in decimal and later encoded to binary and embedded onto tags. In this sample pattern, the Header is fixed to 25 and the General Manager Number is 1545, while the Object Class can be any number between 3456 and 3478, and the Serial Number can be anything between 778 and 795.

Within each EPC, the Uniform Resource Identifier (URI) encoding complements the EPC Tag Encodings, defined for use within RFID tags and other low-level architectural components. URIs provide a means for application software to work with EPCs in a way that is independent of any specific tag-level representation. The URI forms are also provided for pure identities, which contain just the EPC fields that are used to distinguish one item from another. For instance, for the EPC GID-96, the pure identity URI representation is as follows:

urn:epc:id:gid:generalmanagernumber.objectclass.serialnumber
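The range check implied by the sample pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the dictionary layout are hypothetical, the field ranges are the sample values quoted above, and the Header is not checked because it does not appear in the pure identity URI.

```python
def parse_gid_uri(uri):
    """Split a pure identity GID URI into its three integer fields."""
    prefix = "urn:epc:id:gid:"
    if not uri.startswith(prefix):
        raise ValueError("not a GID pure identity URI: " + uri)
    manager, object_class, serial = uri[len(prefix):].split(".")
    return int(manager), int(object_class), int(serial)


# Hypothetical representation of the sample pattern from the text:
# each field maps to an inclusive (low, high) range.
SAMPLE_PATTERN = {
    "manager": (1545, 1545),       # General Manager Number is fixed
    "object_class": (3456, 3478),  # Object Class range
    "serial": (778, 795),          # Serial Number range
}


def matches_pattern(uri, pattern=SAMPLE_PATTERN):
    """Return True if every field of the GID URI lies within the pattern."""
    manager, object_class, serial = parse_gid_uri(uri)
    fields = {"manager": manager, "object_class": object_class, "serial": serial}
    return all(low <= fields[name] <= high
               for name, (low, high) in pattern.items())
```

For example, `matches_pattern("urn:epc:id:gid:1545.3460.780")` is true, while a URI whose Object Class is 3479 falls outside the pattern and is rejected.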

In the pure identity URI for GID-96, the three fields GeneralManagerNumber, ObjectClass, and SerialNumber correspond to the three components of an EPC General Identifier (EPCGlobal, 2008). There are also pure identity URI forms defined for identity types that correspond to certain encodings. The URI representations corresponding to these identifiers are shown in Table 2.1.

Table 2.1: The Uniform Resource Identifier (URI) encoding complements the EPC Tag Encodings defined for use within RFID tags and other low-level architectural components

Encoding Scheme   Uniform Resource Identifier
GID               urn:epc:id:gid:generalmanagernumber.objectclass.serialnumber
SGTIN             urn:epc:id:sgtin:companyprefix.itemreference.serialnumber
SSCC              urn:epc:id:sscc:companyprefix.serialreference
SGLN              urn:epc:id:sgln:companyprefix.locationreference.extensioncomponent
GRAI              urn:epc:id:grai:companyprefix.assettype.serialnumber
GIAI              urn:epc:id:giai:companyprefix.individualassetreference
DoD               urn:epc:id:usdod:cagecodeordodaac.serialnumber

An example encoding of GRAI is demonstrated as follows:

urn:epc:id:grai:

From the above example, the corresponding GRAI is . Referring to Table 2.1, the CompanyPrefix, AssetType, and SerialNumber of the GRAI are represented as , 12345, and 1234 respectively. Some of the major encoding schemes, which are used and incorporated within our methodology, will be further explained within the thesis.

2.5 RFID Data Management Issues

RFID data management is one of many issues surrounding the deployment of RFID. Palmer (2004) stated that data should be absorbed closer to the source, which caters to the need for pre-processed data, so that only relevant and meaningful information is passed to the application software. At this point the raw data has been collected and should be filtered before being passed to the applications. Filtering must be done in this capture layer, and the data captured at this layer is considered to be Dirty data.
This view is supported by Derakhshan et al. (2007), who describe Dirty data in RFID data management as appearing in four general forms: unreliable reads, missed reads, noise, and duplication. After the data has been filtered, it is turned into more meaningful data and stored in the databases. RFID data management is classified into three stages: 1) Data Capturing Process, where Dirty data are captured by RFID devices; 2) Data Processing and Event Management, where Dirty data are computed, transformed, aggregated, and integrated into meaningful data; and 3) Data Warehousing and Data Mining, where data are stored in the database for later use (Melski et al., 2007).

Data Capturing Process

The data capturing process is considered the most important stage in an RFID system, because any data captured during this stage will be used for further processing by the other two stages. The data capture layer is responsible for coordinating and detecting multiple tagged objects and filtering incoming data before sending it to the next layer (Derakhshan et al., 2007). Therefore, any errors that occur at the data capturing level will be carried through the rest of the procedure. Such data stream errors are: unreliable reads, missed reads, noise, and duplication.

Data Processing and Event Management

Simple and complex event detection is one of the primary roles played by the data processing and event management stage. Simple events in RFID applications are those which are generated during the interactions between readers and tagged objects (Derakhshan et al., 2007). In order to detect more complex events, a massive number of simple events must be filtered and correlated. To model the processing of complex events, we need to define a language that is able to filter and correlate the events. Complex event languages have been discussed in different contexts in order to transform, aggregate, and integrate Dirty data into meaningful data (Abadi et al., 2004; Wu et al., 2006; Gyllstrom et al., 2007).

Data Warehousing and Data Mining

The final stage in RFID system management is data warehousing and data mining. Data warehousing is defined as a process of centralised data management and retrieval. It represents an ideal vision of maintaining a central warehouse of all organisational data. All data from the data warehouse must be cleaned before they can be effectively used by the business application. Data mining, on the other hand, is the process of analysing data from the data warehouse and summarising it into useful information.
This step of data management is also important, as some errors from data capturing and event processing may remain (Darcy et al., 2009, 2010a,b).

2.6 Summary

In this chapter, an overview of background information, such as the history of RFID, RFID basics, and a technology overview, was presented. Particular attention was given to the components of RFID systems, which are the Antenna, Reader, and Tag. Applications currently in use, such as RFID technology in the supply chain, were also discussed.

Some data management perspectives were also identified in this chapter. RFID data are raw, implicit, and inaccurate, and accumulate very fast. In order to manage these data efficiently and effectively, they must be cleaned and filtered before being stored in the database. Therefore, in this research, the focus will be on the data capturing level, where RFID data is collected before any further processing. The specific issues and the current methods for filtering RFID data streams, particularly Anti-Collision techniques, will be discussed in the next chapter.

3 RFID Data Streams Management Techniques

In this chapter, we examine different types of data stream filtering techniques, particularly the anti-collision methods. We survey existing filtering approaches for different types of RFID data stream errors, and analyse missed reads caused by data collision, which is the most crucial type of error. We also discuss the anti-collision techniques, and conclude this chapter with a discussion of the research problems and limitations of existing methods.

3.1 Filtering of RFID Data Streams

Due to the low-power and low-cost constraints of passive RFID tags, reliable RFID data capture has become more challenging in many circumstances (Brusey et al., 2003). There are several data filtering processes that handle different types of data stream errors. This section describes and discusses four major data stream errors: Unreliable reads, Missed reads, Noise, and Duplication.

Unreliable Reads

In an RFID deployment, environmental interference such as metal or water can sometimes cause unreliable reads. Moving tags (objects) are also a common cause of unreliable reads. For instance, baggage with RFID tags in airports can move too fast on conveyor belts and not be properly detected by the reader.

Tag deployment affects the RFID data capturing process in many ways and also causes unreliable reads, as described below:

Tag Orientation and Location: Tag performance is affected by the orientation of the tag relative to the reader's antenna. The best tag orientation occurs when the tag and the antenna are parallel to each other. As the tag is rotated away from the parallel position, it collects less power, and the tag read range decreases as the collected power decreases. The location of the tag within the interrogator zone also affects the tag's performance (Sweeney, 2005).

Tag Placement: Placement of the tag on an object affects the tag's performance, especially for passive tags used in a supply chain warehouse. This is because radio waves are mostly absorbed if the tag is located near liquids and metals. The best place to attach a passive tag on cases (at case-level tagging) is where the item packaging provides the most separation from the liquids inside. Various test methods must be used in order to determine the best tag location for specific product types (Brown et al., 2007).

Tag Stacking/shadowing: Tag stacking occurs when several tags are placed close to each other. For example, item B is placed behind item A. From the perspective of the reader, tag A, which is closer to the antenna, absorbs and reflects most of the radio energy. Therefore, tag B receives a very weak signal and could be missed by the reader. To avoid this problem, the packaging method should be designed carefully and the case should be rotated in front of several antennas installed at various angles to the case. Tag stacking is also referred to as tag shadowing (Sanghera et al., 2007).

According to Fishkin et al. (2004) and D'Mello et al. (2008), experimental testing has shown that the deployment of tags and readers affects how the low-power radio signal emitted through the reader's antenna reaches the passive tag.
These studies can help resolve some unreliable reads that are caused by different environments. Unreliable reads can be solved at the hardware level, where the relevant parameters can be set according to specific guidelines. The following parameters affect the way tags receive power from readers:

Flooring: A metal floor results in reader failure to detect tags.

Distance between tag and reader: If tags are out of the reader's range (or in a weak range), the percentage of tags detected by the reader is very low (or 0).

Number of tags on an object and their placement on that object: A certain number of tags can increase the accuracy of the reading, but too many tags in the reading scope can result in tag collisions.

Number of readers and their deployment topology: Too many readers covering the same area can result in a lot of duplication.

Number of nearby tags: Nearby tags may be detected by a reader even though they are outside its intended range. This results in noisy readings.

Number of objects moved simultaneously: The experiment shows that when a handheld reader is walked past tags, most tags are detected by the handheld reader, especially those tags directly facing it.

Tag orientation and rotation: A rotation of 30 and 60 degrees results in all tags being detected by an RFID reader. However, a 90 degree rotation of a reader results in no tags being captured.

Noises

Noise refers to additional, unexpected readings. It can be caused by an RFID tag outside the normal reading scope of a reader being captured for an unknown reason (Bai et al., 2006). Table 3.1 shows that since three tags, i.e. TagD, TagF, and TagG, are below the specified threshold, these tags are classified as noise readings. In addition, TagB (SGTIN encoding) and TagH (GRAI encoding) are also classified as noise, as these tags have different encoding schemes.

Table 3.1: A sample of noise where * indicates a noise reading.
Since the noise threshold equals 3 and the tag catch is only for GID encoding, any tag that appears fewer than three times within a specific time frame, or does not satisfy the tag catch requirement, is classified as noise.

Tag    EPC (CATCH:gid: )      Count (threshold = 3)
TagA   urn:epc:id:gid:        Count1
TagA   urn:epc:id:gid:        Count2
TagA   urn:epc:id:gid:        Count3
TagB   urn:epc:id:sgtin: *    Count1
TagA   urn:epc:id:gid:        Count4
TagC   urn:epc:id:gid:        Count1
TagC   urn:epc:id:gid:        Count2
TagD   urn:epc:id:gid:        Count1
TagD   urn:epc:id:gid:        Count2*
TagC   urn:epc:id:gid:        Count3
TagE   urn:epc:id:gid:        Count1
TagE   urn:epc:id:gid:        Count2
TagE   urn:epc:id:gid:        Count3
TagF   urn:epc:id:gid:        Count1*
TagG   urn:epc:id:gid:        Count1
TagG   urn:epc:id:gid:        Count2*
TagH   urn:epc:id:grai: *     Count1

In the literature, Bai et al. (2006) proposed several algorithms covering both noise removal and duplication elimination. Three sliding-window-based algorithms for denoising were proposed in the paper. According to the authors, a sliding window is a window of a certain size that moves with time. RFID tag readings enter the window and expire after a certain time. Noise readings are readings whose count of distinct tag EPC values is below a threshold. The three algorithms are briefly summarised below:

The first algorithm is Baseline denoising. For each incoming reading of value R, a full scan of the window is performed. If the count of readings with value R in the window meets the threshold, R is classified as a non-noise reading. The readings with the same key are then output if the threshold condition is satisfied. To ensure a particular reading is never output more than once, an output state is kept with each reading in the window buffer and set to true once the reading has been output.

The second algorithm, Lazy denoising, is an improved version of the first algorithm. The output from the Baseline denoising algorithm can be out of order. This affects all further RFID data processing where correct ordering of observations is critical, such as complex RFID event detection for real-time RFID applications, and RFID data aggregation. To solve this out-of-order problem, the Lazy denoising algorithm uses a hash table along with output order preservation. The output of readings from this algorithm is delayed until they expire from the sliding window, to ensure that everything is output in order.

The third algorithm, Eager denoising, is an improved version of the Lazy denoising algorithm and is also implemented with output order preservation. In the second algorithm, the output is delayed until the window expires. This can be a problem when the window is wide. The third algorithm can output data earlier while preserving the correct time order. This is possible because the wrong-order issue occurs only when a reading has been output before the labeling of some earlier reading within the window changes.
Therefore, a non-noise reading can be safely output, without the risk of ordering problems, once it is known that no earlier noise reading is present in the sliding window.

Duplications

As stated by Derakhshan et al. (2007), the Duplication problem is recognised as a serious issue in RFID and sensor networks because it often reduces system robustness. Duplication can happen at two different levels: duplication at reader level and duplication at data level.

Duplication at reader level

Duplication at reader level occurs when more than one reader is deployed to cover a specific location. Figure 3.1 shows that readers R1 and R2 cover the same shaded area on the left (S1), and readers R2 and R3 cover the same shaded area on the right (S2).

Figure 3.1: An example deployment of three readers, where R1 and R2 cover S1, and R2 and R3 cover S2

Carbunar et al. (2005) proposed an algorithm called Redundant Reader Elimination (RRE), which is a randomised, decentralised, and localised approximation algorithm for the RRE problem. Its purpose is to eliminate duplication at the reader level. The RRE algorithm has three steps. Firstly, the algorithm detects the set of RFID tags placed within the area covered by each reader. Secondly, each RFID reader attempts to write its count of covered tags onto all of its covered tags. The reader that issued the highest count for a tag will mark the tag. Finally, each reader queries all its covered tags and determines the ones it has marked. A reader that has not marked any of its covered tags is declared a duplicate reader.

Duplication at data level

Duplication at data level occurs within the data streams themselves. RFID data can be captured very fast and is usually less meaningful without transformation. Some of these data are captured more than once, so it is possible to identify and eliminate the duplicates before passing the data to the application. Table 3.2 shows that TagE is captured twice and TagF is captured three times. Since these duplicated reads are close to each other, it is assumed that the tag remained in the scope of a reader for a long time and was read by the same reader multiple times.

Two algorithms for duplication removal at data level, called Baseline merge and Hash merge, were proposed by Bai et al. (2006). Baseline merge eliminates duplication by maintaining a timestamp that indicates the last time a reading with the same key as the incoming reading appeared. Hash merge uses a hash table to keep the last appearance timestamp for each distinct key value.
For each incoming reading, its timestamp is compared with the corresponding entry for this key in the hash table. The reading is determined to be a new tag reading if the key does not appear in the table, or if the time distance is larger than the threshold. Furthermore, Pupunwiwat and Stantic (2007) also proposed a Location Filtering and Duplication Elimination technique to eliminate any data duplication and, at the same time, locate where the data has been captured.
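The hash-table comparison described for Hash merge can be sketched as follows. This is a minimal illustration written from the description above, not the code of Bai et al. (2006); the function name `hash_merge`, the `(timestamp, key)` tuple layout, and the choice of milliseconds as the time unit are assumptions made for the example.

```python
def hash_merge(readings, threshold):
    """Yield only the readings that are not duplicates.

    readings:  iterable of (timestamp, key) tuples in time order
               (assumed layout; the key would be the tag EPC).
    threshold: maximum time distance within which a repeated key
               is still treated as a duplicate.
    """
    last_seen = {}  # hash table: key -> last appearance timestamp
    for timestamp, key in readings:
        previous = last_seen.get(key)
        # New reading if the key is unknown or the gap exceeds the threshold.
        is_new = previous is None or timestamp - previous > threshold
        # Always record the last appearance, duplicate or not.
        last_seen[key] = timestamp
        if is_new:
            yield timestamp, key
```

With a threshold of 1000 and the stream `[(0, "TagE"), (100, "TagE"), (200, "TagF"), (250, "TagF"), (300, "TagF"), (5000, "TagE")]`, the sketch keeps the first TagE and TagF readings, drops the closely spaced repeats, and reports TagE again at time 5000 because the gap since its last appearance exceeds the threshold.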

Table 3.2: A sample of data duplication, where TagE is captured twice and TagF is captured three times

Tag    EPC (CATCH:gid: )    Count = 1
TagA   urn:epc:id:gid:      Count1
TagB   urn:epc:id:gid:      Count1
TagC   urn:epc:id:gid:      Count1
TagD   urn:epc:id:gid:      Count1
TagE   urn:epc:id:gid:      Count1
TagE   urn:epc:id:gid:      Count2*
TagF   urn:epc:id:gid:      Count1
TagF   urn:epc:id:gid:      Count2
TagF   urn:epc:id:gid:      Count3*
TagG   urn:epc:id:gid:      Count1
TagH   urn:epc:id:gid:      Count1
TagI   urn:epc:id:gid:      Count

Missed Reads

Missed reads are very common in RFID applications and often occur with low-cost, low-power hardware, which leads to frequently dropped readings (Derakhshan et al., 2007). Another source of missed reads arises when multiple tags are detected by a reader and Radio Frequency (RF) collisions occur, causing RF signals to interfere with each other and preventing the reader from identifying any tags (Bai et al., 2006). Dropped readings can be handled easily using Smoothing techniques, which fill in data missing at specific times (Jeffery et al., 2005, 2006a,b). However, preventing the data loss that results from RF collisions is harder. In order to solve this problem, anti-collision is usually performed at the edge, to prevent two or more tags from responding to a reader at the same time. Tag anti-collision protocols, sometimes referred to as Multi-Access protocols, can be classified into probabilistic and deterministic methods, which are explained further in Section 3.2. Table 3.3 shows an example of missed reads where, at times 500msec, 800msec and 1000msec, a reading of TagA is missing.
Table 3.3: A sample of Missed reads where at time 500msec, 800msec and 1000msec, readings of TagA are missing

Time (msec)   Tag    EPC (CATCH:gid: )
100           TagA   urn:epc:id:gid:
              TagA   urn:epc:id:gid:
              TagA   urn:epc:id:gid:
              TagA   urn:epc:id:gid:
              TagA   urn:epc:id:gid:
              TagA   urn:epc:id:gid:
              TagA   urn:epc:id:gid:

As mentioned earlier, the Smoothing technique can be used to handle missed reads. Smoothing is part of a proposed technique called Extensible Sensor stream Processing (ESP), which focuses on cleaning data at the edge of the network. It allows raw data to be cleaned by processing multiple data streams and exploiting the temporal aspect of data to produce a single improved output stream that can be used directly by applications. In addition, a duplication elimination method, the Arbitrate stage of ESP, is proposed by the authors. According to Jeffery et al. (2005, 2006a,b), ESP has five different stages for data filtering:

The Point stage filters individual values, e.g. RFID tags or obvious Missed reads;

At the Smoothing stage, ESP interpolates for lost readings within a temporal granule. ESP runs this query over each reader data stream. The query begins by breaking the streams into smaller slices that correspond to the size of the granule. Through a windowed operation over these slices, Smooth fills in dropped readings for any tag that is seen at least once in the slice time period;

The Merge stage corrects the Missed reads and removes outliers spatially;

The Arbitrate stage deals with conflicts, such as duplication;

The Virtualise stage combines readings from different types of devices.

Floerkemeier and Lampe (2004, 2005) identified a solution to reduce the tag collisions that cause missed reads, using a playing card scenario. Based on a Framed-ALOHA technique, the authors used different layouts of playing cards to determine the tags captured by the reader within a set time period. The results indicated that stacked cards have the worst tag recognition rate, since they produce the most collisions when stacked together. To reduce the missed reads that arise from tag collisions, the bandwidth of RFID technology should be increased.
Since the 900 MHz range in the UHF band offers significantly more bandwidth in the communication from the reader to the tag than the regulations on the MHz in HF band allow, an RFID system operating in the UHF band can detect tags much faster. Nevertheless, tag collision issues remain a challenging topic in current research.

3.2 Collision Handling in RFID Data Streams

RFID collision handling is one of the most heavily researched topics because it is a very important step in determining the quality of captured data. Better data quality at the earlier stage of data processing means less complex algorithms are needed for RFID event processing and database management. This section explains each type of collision, the classification of Multi-Access, the taxonomy of RFID tag anti-collision protocols, and literature surveys on existing deterministic and probabilistic anti-collision methods.


RFID Collision Types

Simultaneous transmissions in RFID systems lead to collisions, as the readers and tags typically operate on the same channel. Three types of collisions are possible: Reader-Reader collision, Reader-Tag collision, and Tag-Tag collision.

Figure 3.2: Collision Problems in RFID System: a) Reader-Reader Collision, b) Reader-Tag Collision, and c) Tag-Tag Collision

Reader Collisions

There are two types of Reader collision: 1) Reader-to-Reader, and 2) Reader-to-Tag (Leong et al., 2005; Jain and Das, 2006).

Reader-to-Reader Collisions

Interference occurs when one reader transmits a signal that interferes with the operation of another reader, and prevents the second reader from communicating with tags in its interrogation zone (Jain and Das, 2006). Reader-to-Reader collision can be avoided by choosing a reader deployment that prevents direct signal interference between two or more readers. Figure 3.2a) shows an example of Reader-to-Reader collision.

Reader-to-Tag Collisions

Interference occurs when one tag is simultaneously located in the interrogation zone of two or more readers, and more than one reader attempts to communicate with that tag at the same time (Jain and Das, 2006). Figure 3.2b) shows an example of Reader-to-Tag collision. The classification of, and solutions to, Reader collision problems are illustrated in Figure 3.3.


Figure 3.3: Taxonomy of RFID Readers anti-collision protocols

Tag Collisions

Tag collision in RFID systems, sometimes known as Multi-Access, happens when multiple tags are energised by the RFID reader simultaneously and reflect their respective signals back to the reader at the same time. This problem is often seen whenever a large volume of tags must be read together in the same reader zone. The reader is unable to differentiate these signals. Figure 3.2c) shows an example of Tag-to-Tag collision.

Division Classification for Multi-Access

The tag collision, or Multi-Access, problem is more complex than the problems within the reader collision categories. Several works in the literature (Shih et al., 2006; Tang and He, 2007; Liu, 2010) explain and differentiate between the types of Multi-Access Divisions. These divisions are summarised below:

SDMA (Space Division Multiple Access): The term SDMA relates to techniques that reuse a certain resource, such as channel capacity, in spatially separated areas (Shih et al., 2006; Tang and He, 2007). One method is to reduce the range of a single reader while retaining coverage of the required area by grouping a large number of readers together to form an array. As a result, the channel capacity of adjoining readers is made available simultaneously. A disadvantage of the SDMA technique is the high implementation cost of the complicated reader deployment. The use of this type of anti-collision procedure is therefore restricted to a few specialised applications.

FDMA (Frequency Division Multiple Access): The term FDMA relates to techniques in which several transmission channels on various carrier frequencies are simultaneously available to the communication members (Shih et al., 2006; Tang and He, 2007). In RFID systems, this can be achieved using tags with a freely adjustable transmission frequency. One disadvantage of the FDMA procedure is the high cost of the readers, since a specific receiver must be provided for every reception channel. This anti-collision procedure also remains limited to a few specialised applications.

CDMA (Code Division Multiple Access): There are a number of different subtypes of CDMA, depending on how the spreading is done. The common factor is that CDMA uses spread spectrum modulation techniques based on pseudo-random codes to spread the data over the entire spectrum. Even though CDMA would be ideal in many ways, its disadvantage is that it adds considerable complexity and would be too computationally intense for RFID tags (Shih et al., 2006; Tang and He, 2007; Dabas et al., 2009).

TDMA (Time Division Multiple Access): The term TDMA relates to techniques in which the entire available channel capacity is divided chronologically between the participants (Shih et al., 2006; Tang and He, 2007). In RFID systems, TDMA procedures are the largest group of anti-collision procedures. Tag-driven and reader-driven procedures are differentiated as follows:

Tag-driven: Tag-driven procedures function asynchronously because the reader does not control the data transfer. For example, in the Pure ALOHA procedure, a tag begins transmitting as soon as it is ready and has data to send. Tag-driven procedures are naturally very slow and inflexible.

Reader-driven: Most applications use procedures that are controlled by the reader as the master. These procedures can be considered synchronous, since all tags are controlled and checked by the reader simultaneously. An individual tag is first selected from a large group of tags in the interrogation zone, and then communication takes place between the selected tag and the reader.
Examples of algorithms using reader-driven procedure are the deterministic algorithm, such as Query Tree; and the probabilistic algorithm, such as Framed-Slotted ALOHA Taxonomy of RFID Tag Anti-Collision Protocols The various types of anti-collision methods for multi-access/tag collision can be reduced to two basic types: probabilistic method and deterministic method (Klair et al., 2007; Choi and Lee, 2007; Bang et al., 2009; Alotaibi et al., 2009; Li et al., 2009; Klair et al., 2010; Zhu and Yum, 2011). In probabilistic methods, tags respond at randomly generated times. If a collision occurs, colliding tags will have to identify themselves again after waiting for a random period of time. This technique is faster than deterministic but suffers from tag starvation problem where not all tags can be identified due to the random nature of chosen time. 34


The deterministic method begins the identification process by issuing a prefix until it gets matching tags. It then continues to ask for additional prefixes until all tags within the region are found. This method is slow but leads to fewer collisions and has a high successful identification rate.

There are also some hybrid anti-collision protocols that combine the advantages of tree-based and ALOHA-based approaches. From the literature, it is clear that most hybrid protocols combine the Query Tree protocol with an ALOHA variant (Klair et al., 2010).

Figure 3.4 shows the classification of tag anti-collision protocols implemented in TDMA, including probabilistic and deterministic methods. Deterministic methods are further divided into Memory and Memoryless protocols: Memory protocols need additional memory beyond the tag ID, while Memoryless protocols need only the tag ID for the whole identification process. The most advanced and efficient probabilistic anti-collision approach is Framed-Slotted ALOHA, while the memoryless Query Tree is the simplest and most robust technique. The literature on both deterministic and probabilistic anti-collision is surveyed in detail in the next two sections.

Figure 3.4: Taxonomy of RFID Tags anti-collision protocols

3.3 Deterministic Anti-Collision Protocols

Deterministic methods can be classified into Memory tree-based algorithms and Memoryless tree-based algorithms. In Memory algorithms, which can be grouped into Tree Splitting, Binary Search, and Bit Arbitration, the reader's inquiries and the responses of the tags are stored and managed in the tag memory. This increases equipment cost, especially for RFID tags. In contrast, in Memoryless algorithms, the responses of the tags are not determined by the reader's previous inquiries. The tags' responses are determined only by the reader's present inquiry, so that the cost of the tags can be minimised. Query Tree is classified as a Memoryless algorithm.

Depending on the number of tags that respond to the interrogator, there are three types of communication cycle between tag and reader in deterministic approaches.

Collision cycle: A collision cycle occurs when more than one tag responds to the reader. The reader cannot identify the IDs of the tags.

Idle cycle: An idle cycle occurs when no tag responds to the reader. This type of cycle is unnecessary and should be minimised.

Successful cycle: A successful cycle happens when exactly one tag responds to the reader and the reader can identify the ID of that tag.

The rest of this section discusses each type of tree-based anti-collision algorithm and its benefits and drawbacks.

Binary Search

The Binary Search (BS) algorithm (Finkenzeller, 2003) involves the reader transmitting an EPC string to the tags, which each tag then compares against its ID. Tags whose ID is equal to or lower than the requested string respond. The reader then monitors the tags' responses bit by bit using Manchester coding, where the value of a bit is defined by the change in level (negative or positive transition) within a bit window. A logic 0 is coded by a positive transition, while a logic 1 is coded by a negative transition. The no-transition state is not permissible during data transmission and is recognised as an error. Once a collision occurs in BS, the reader splits the tags into subsets based on the collided bits.

The enhanced version of the BS protocol is called the Dynamic Binary Search Algorithm (DBSA) (Finkenzeller, 2003). In DBSA, the reader and tags do not use the entire length of the EPC and tag IDs during the identification process. For example, if a reader receives the response 01X, tags only need to transmit the remaining part of their ID, since the reader has already identified the prefix 01. This enhancement effectively reduces the amount of data sent by the reader to tags.
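The bit-by-bit collision detection described above can be illustrated with a minimal Python sketch. It is not taken from Finkenzeller (2003). It assumes a simplified model in which replies are perfectly synchronised, equal-length bit strings, and any bit position where the replies disagree is seen by the reader as the non-permitted "no transition" state, written here as X. The function names `superimpose` and `known_prefix` are hypothetical.

```python
def superimpose(replies):
    """Combine simultaneous tag replies bit by bit.

    Assumption: replies are synchronised, equal-length strings of '0'/'1'.
    A position where the replies disagree is reported as 'X' (collided bit).
    """
    if not replies:
        return ""
    length = len(replies[0])
    if any(len(reply) != length for reply in replies):
        raise ValueError("all replies must have the same length")
    combined = []
    for bits in zip(*replies):
        # Identical bits give a clean transition; mixed bits cancel out.
        combined.append(bits[0] if len(set(bits)) == 1 else "X")
    return "".join(combined)


def known_prefix(combined):
    """Return the bits the reader has resolved before the first collided bit."""
    return combined.split("X", 1)[0]
```

With the two replies 010 and 011, `superimpose` returns 01X and `known_prefix` returns 01, which is the prefix that DBSA no longer needs the tags to repeat.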
Additionally, there are several improved and enhanced BS algorithms (Yu et al., 2005; Liu et al., 2005; Chen and Liao, 2010), which introduce higher implementation complexity. Nevertheless, all types of BS algorithm require extra memory besides the EPC data itself, especially the most enhanced versions, which require additional memory to store information on the current stage of the reading process.

Bit Arbitration

Bit Arbitration (BA) algorithms are memory-based anti-collision algorithms and are less robust than those in the memoryless category. The key feature of BA algorithms is that bit

replies are synchronised, meaning that responses from multiple tags with the same bit value result in no collision. A collision is observed only if two tags respond with different bit values. Moreover, the reader has to specify the bit position it wants to read. There are several algorithms in this category, including ID Binary Tree Stack, Bit-by-Bit Binary Tree, Modified Bit-by-Bit Binary Tree, and Enhanced Bit-by-Bit Binary Tree (Klair et al., 2010).

In the ID Binary Tree Stack (ID-BTS) (Feng et al., 2006), the reader uses a stack to store the tags' positions on the tree, while each tag has a counter to record the depth of the reader's stack. Based on this counter value, a tag determines whether it is in the transmit or wait state: a counter value of zero moves a tag into the transmit state; otherwise, the tag enters the wait state. Once a tag is identified, it enters the sleep state. This technique demands a large amount of reader memory, due to the heavy use of the stack.

Jacomet et al. (1999) presented a Bit-by-Bit Binary Tree (BBT) arbitration method, where separate channels are used for binary 0 and 1. When requested, each tag transmits the specified bit in one of these channels. If the reader receives responses on both channels, it sends a control bit silencing the subset of tags that replied with 0 or 1. On the other hand, if the reader receives a response in only one of the two channels, that bit is successfully identified. Similar to ID-BTS, the reader has a stack and each tag has a counter to store its tree position.

Choi et al. (2004) proposed the Modified Bit-by-Bit Binary Tree (MBBT), which operates in a similar manner to the BBT algorithm. The key difference is that MBBT does not use multiple time-slots to receive binary 0s and 1s. Choi et al. (2004) also proposed an Enhanced Bit-by-Bit Binary Tree (EBBT). In EBBT, a reader first requests tags to respond with their complete ID. The assumption here is that the tags' responses are synchronised. From these responses, the reader identifies collided and non-collided ID bits. The reader then uses MBBT to identify the collided bits.

Tree Splitting

Tree Splitting (TS) protocols operate by splitting responding tags into multiple subsets, using a random number generator. In this category of anti-collision, the reader needs less memory than in the Binary Search and Bit Arbitration categories, because TS only needs to store the random binary numbers. We present two algorithms in this category: Binary Tree Splitting and Adaptive Binary Splitting.

Binary Tree Splitting

The Binary Tree Splitting (BTS) protocol uses randomly generated binary numbers for the splitting procedure (Myung and Lee, 2006a). The tag has a counter initialised to 0 at the beginning

of the process. The tag transmits its ID when the counter value is 0. The reader transmits a response to inform tags when a tag collision has occurred. A tag randomly generates a binary number when its transmission causes a collision. By adding the selected binary number to the counter, a set is split into two subsets, 0 or 1, as shown in Figure 3.5.

Figure 3.5: Binary Tree Memory based anti-collision protocol

Adaptive Binary Splitting

Tag identification in BTS protocols starts from a single tag set that includes all tags, which causes more tag collisions while the tag sets are split. A colliding tag needs to re-transmit its ID whenever a tag collision occurs. The Adaptive Binary Splitting (ABS) protocol uses information from the last frame of the tree and starts a new tag identification from multiple tag sets. Hence, the reader can recognise tags with fewer collisions. ABS begins tag identification from only the readable cycles of the last frame and uses random numbers for splitting tag sets. This technique is an improvement on the BTS protocol. However, it requires extra memory to store information from the previous frame. ABS also requires tags to support both transmission and reception at the same time.

Query Tree

In TS variants, tags require a random number generator and a counter to track their tree position, which makes them costly and computationally complex. Query Tree (QT) algorithms overcome these problems by storing tree construction information at the reader, so that tags only need a prefix matching circuit. Numerous variants of query tree algorithms exist. Among all tree protocols, QT protocols promise the simplest tag design (Klair et al., 2010). For tree-based anti-collision, we focus on QT-based protocols because they are the most widely accepted and are an effective anti-collision technique for passive UHF tags (Klair et al., 2010). There are several improved anti-collision methods based on QT, such as Adaptive Query Splitting (AQS), proposed by Myung and Lee (2006b), and a Hybrid

Query Tree (HQT) proposed by Ryu et al. (2007). The AQS requires tags to support both transmission and reception at the same time, which makes it difficult to apply to low-cost passive RFID systems. The HQT, on the other hand, reduces collision cycles but at the same time introduces too many idle cycles. Accordingly, the tree-based anti-collision protocols that can be implemented effectively may be limited to the QT algorithm, which is currently adopted as the anti-collision protocol in EPC Class 1 (Choi et al., 2008).

The QT (Law et al., 2000) is a data structure for representing the prefixes that are sent by the RFID reader. The QT algorithm consists of loops; in each loop, the reader issues a query with a specific prefix, and the matching tags respond with their information. If only one tag replies, the reader successfully recognises the tag. If more than one tag tries to respond to the reader's query, a tag collision occurs and the reader cannot get any information about the tags. The reader can, however, recognise that tags with IDs matching the query exist. To further identify the collided tags, the QT algorithm queries with prefixes that are 1 bit longer in the next round of identification. By extending the prefixes, the reader can recognise all the tags.

Figure 3.6: Query Tree Memoryless based anti-collision protocol

Figure 3.6 displays an example of a QT procedure. The identification process starts at level one of the tree, where QT uses tag IDs to split the tag set. Tag 1010 is successfully identified in the first round because, of the three tags, only Tag 1010 has 1 as its first bit. In the second round of identification, an idle cycle occurs, as no tag starts with 00 in its first two bits. In the third round of identification, the other two tags, Tag 0100 and Tag 0111, are successfully identified.

Adaptive Query Splitting

The Adaptive Query Splitting (AQS) protocol uses information from the last frame of the tree for tag identification, so that the reader can recognise tags with fewer collisions. The AQS recognises tags through queries sent by the reader, each of which includes a bit string. The basic idea of AQS is based on QT, where a tag responds with its ID when the first bits of its ID are


equal to the bit string of the query. The reader has a queue Q, which maintains the bit strings for queries. At the beginning of the frame, Q is initialised with the queries of all the leaf nodes in the tree of the last frame. AQS keeps the information acquired during the last identification process in order to shorten the collision period. Like ABS, this technique requires tags to support both transmission and reception at the same time. In addition, according to Bhatt and Glover (2006), the adaptive splitting protocol is only compatible with EPC class 0 and class 1 generation 1, and is more complex than basic BTS and QT. Figure 3.7 shows that ABS (refer to the Tree Splitting section) and AQS start the tree search from the leaf nodes of the tree from the last frame.

Figure 3.7: The starting point of tag identification in tree-based protocols

Hybrid Query Tree

Hybrid Query Tree (HQT) utilises a 4-ary query tree instead of a binary query tree (Ryu et al., 2007). Figure 3.8 shows an example of the identification process in QT (a) and HQT (b). Although this technique reduces collision cycles, it introduces many more idle cycles, and the extra memory needed grows as the identification process gets longer, because each query extends the prefix by 2 bits instead of 1 bit.

Table 3.4 compares the operations of the QT protocol and the HQT protocol for the sample in Figure 3.8. As we can see, HQT reduces the number of query commands as well as the number of collisions. However, the number of idle cycles increases as a side-effect. There is a basic idle-cycle elimination scheme (a slotted back-off tag response mechanism) for HQT, but it requires more time and memory. The extended version of HQT also requires extra memory, since it mimics AQS in keeping information from the last identification. Nevertheless, HQT is better than QT at reducing collisions between tags, especially when the number of tags is high.
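The trade-off between collision cycles and idle cycles can be reproduced with a short Python sketch. It is an illustration only, not the implementation of Law et al. (2000) or Ryu et al. (2007). It models a plain prefix-query tree without the slotted back-off mechanism of HQT, assumes unique tag IDs of equal length, and starts both variants at level one of the tree. The function name `query_tree` and the parameter `bits_per_step` are hypothetical.

```python
from collections import deque


def query_tree(tag_ids, bits_per_step=1):
    """Simulate a memoryless prefix-query tree and count the cycle types.

    bits_per_step=1 models the binary Query Tree; bits_per_step=2 extends a
    collided prefix by two bits, as in a 4-ary tree.
    Assumption: tag IDs are unique bit strings of equal length.
    """
    suffixes = [format(i, f"0{bits_per_step}b") for i in range(2 ** bits_per_step)]
    queue = deque(suffixes)  # the reader keeps the pending prefixes
    counts = {"collision": 0, "idle": 0, "successful": 0}
    identified = []
    while queue:
        prefix = queue.popleft()
        # A tag only needs a prefix-matching circuit to decide whether to reply.
        replies = [tag for tag in tag_ids if tag.startswith(prefix)]
        if not replies:
            counts["idle"] += 1
        elif len(replies) == 1:
            counts["successful"] += 1
            identified.append(replies[0])
        else:
            counts["collision"] += 1
            queue.extend(prefix + suffix for suffix in suffixes)
    return identified, counts
```

For the three tags used in the Figure 3.6 example (1010, 0100 and 0111), the binary variant needs 2 collision cycles, 1 idle cycle and 3 successful cycles, while the 4-ary variant needs 1 collision cycle, 4 idle cycles and 3 successful cycles. This matches the qualitative behaviour described above: fewer collisions at the price of more idle cycles.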

Figure 3.8: Tree-based protocols: a) Query tree protocol, b) 4-ary tree protocol

Table 3.4: Identification process of Query Tree versus Hybrid Query Tree (columns: Step, Query string, Query result, Query queue)

Query Tree protocol, starting with query string 0. Query results in step order: collision, collision, successful, successful, collision, idle, successful, successful.

Hybrid Query Tree protocol, starting with the empty string. Query results in step order: collision, successful, successful, collision, successful, idle, successful.

Other Query Tree-Based Algorithms

There are other improved versions of QT, which enhance performance but increase implementation cost due to their more complex execution. These techniques include the Improved QT (ImpQT) algorithm (Zhou et al., 2004), the QT-based Reservation (QTR) algorithm (Choi et al., 2007), and the Intelligent Query Tree (IntQT) (Bhandari et al., 2006). For deterministic anti-collision approaches, simple algorithms are preferred, since tree-based techniques are adopted mainly in older RFID systems. More recent technology uses ALOHA-based anti-collision algorithms rather than tree-based ones.

From the observation and literature survey, we discover that for tree-based approaches,

the number of identification cycles, the total memory bits required, and the similarity of IDs mostly affect the delay of tag identification. Therefore, we make the assumption that by taking advantage of the EPC pattern and the bulky movement of items (see Chapter 2), the identification ability of the reader can be improved without the need for a complex anti-collision algorithm.

3.4 Probabilistic Anti-Collision Protocols

In a probabilistic approach, tags respond to readers at randomly generated times. If a collision occurs, the colliding tags have to identify themselves again after waiting a random period of time (Choi and Lee, 2007; Li et al., 2009; Bang et al., 2009; Klair et al., 2010). In RFID, the probabilistic anti-collision approach usually refers to the ALOHA-based approach, which is the most widely used type of anti-collision.

Slotted ALOHA (Quan et al., 2006), which initiates discrete time-slots in which tags are identified by the reader at a specific time, was first employed as an anti-collision method in the early days of RFID technology. The principle of Slotted ALOHA techniques is based on Pure ALOHA, introduced in the early 1970s (Abramson, 1970), where each tag is identified randomly. To improve performance and throughput, different anti-collision schemes have been suggested in the literature. The Framed-Slotted ALOHA technique is the most improved ALOHA-based technique currently applied in many applications. The three most accepted Framed-Slotted ALOHA techniques are Basic Framed-Slotted ALOHA, Dynamic Framed-Slotted ALOHA, and Enhanced Dynamic Framed-Slotted ALOHA. Several researchers (Wang et al., 2007; Lee et al., 2005b, 2008a; Cho et al., 2007) have also attempted to improve throughput by implementing more accurate Frame-size Estimation algorithms.

Figure 3.9: A sample procedure of Frame-slotted ALOHA
Figure 3.9 shows an example of a Frame-slotted ALOHA anti-collision protocol. Each frame is formed of a specific number of slots that are used for communication between the reader and the tags. Any slot in which more than one tag responds is classified as a collision slot, while any slot in which exactly one tag responds is a successful slot. An empty slot occurs when no tag responds within that time slot. Figure 3.9 shows that Slots 1 and 2 of Frame one and Slot 5 of Frame two are collision slots; Slot 3 of Frame one, Slot 4 of Frame two, and Slot 6 of Frame three are successful slots; and Slot 7 of Frame three is an empty slot.

The rest of this section describes each type of ALOHA-based anti-collision algorithm and its benefits and limitations.

BFSA Method

The Basic Framed-Slotted ALOHA (BFSA) is the most basic ALOHA-based algorithm and uses a fixed frame-size throughout the identification round. The reader offers information to the tags, including the frame-size and the random number that is used to select a slot within the frame. Each tag selects a slot using the random number and then sends its ID back to the reader (Ding and Liu, 2009; Lee and Lee, 2006; Lee et al., 2008b).

Since the frame-size of BFSA is fixed, its implementation is simple. However, the system's efficiency drops significantly when the tag count is too large or too small. For instance, no tag may be identified in a read cycle if there are too many tags within the interrogation zone. On the other hand, when a large frame-size is used with a small tag count, many empty slots are produced, which decreases system efficiency.

DFSA Method

The Dynamic Framed-Slotted ALOHA (DFSA) overcomes the problems associated with BFSA by dynamically changing the frame-size according to the estimated Backlog, which is the number of tags that have not yet been read. In DFSA, each tag in the interrogation zone selects one of the given N slots to transmit its identifier, and all tags are recognised after a few frames. Each frame is formed of a specific number of slots that are used for communication between the reader and the tags. To determine the frame-size, the reader gathers information such as the number of successful slots, empty slots, and collision slots from the previous round, and uses it to predict the appropriate frame-size for the next identification round (Ding and Liu, 2009; Lee and Lee, 2006; Devarapalli et al., 2007; Fan et al., 2008a).

DFSA can identify tags efficiently because the reader adjusts the frame-size according to the estimated number of tags. However, changing the frame-size alone cannot sufficiently reduce tag collisions when there are a large number of tags, because the frame-size cannot be increased indefinitely. DFSA has various versions, depending on the tag estimation method used. Several studies have sought to improve the accuracy of the frame-size by implementing frame-size estimation techniques (Lee et al., 2005b, 2008a; Cho et al., 2007; Fan et al., 2008b).

According to the DFSA protocol, the reader selects tags within an interrogation zone with the Select command and then issues Query, which contains a Q parameter that specifies the frame-size ($2^Q$ slots, numbered 0 to $2^Q - 1$). Each selected tag picks a random number between 0 and $2^Q - 1$ and puts it into its slot counter. The tag, which

picks zero as its slot number, will respond and backscatter its EPC to the reader. The reader then issues a QueryRep or QueryAdjust command to initiate another slot (Wang et al., 2007; Zhu and Yum, 2009). Similar to tree-based anti-collision, there are three kinds of slot in ALOHA-based anti-collision, as shown in Figure 3.10: 1) an empty slot, where there is no tag reply; 2) a successful slot, where there is only one tag reply; and 3) a collision slot, where there is more than one tag reply. The term initial Q refers to the first Q, or frame-size, that applies to a specific identification cycle.

In Figure 3.10a), the reader first initiates a Query and broadcasts the signal to nearby tags. Since no tag has picked zero as its slot counter, the slot is counted as an empty slot. Figure 3.10b) shows that, after the first Query is sent, each tag decrements its slot counter by one. The reader then sends QueryRep to the tags in close proximity, and any tag that has zero as its slot counter replies. When only one tag responds, a successful slot occurs and the tag replies to the reader with its RN16. Figure 3.10c) demonstrates that when two tags respond to the reader at the same time, a collision slot occurs and, in this case, no information is transmitted.

Figure 3.10: Empty Slot, Successful Slot, and Collision Slot in EPC Class 1 Generation 2 Protocol
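The slot-counter mechanism described above can be sketched in Python as a single inventory frame. This is a simplified model rather than an implementation of the EPC Class 1 Generation 2 protocol. It assumes a frame of 2^Q slots (one for each counter value from 0 to 2^Q - 1), omits the Select, RN16 and QueryAdjust exchanges, and lets a tag that has already replied simply stay silent for the rest of the frame. The function name `run_frame` is hypothetical.

```python
import random


def run_frame(num_tags, q, rng):
    """Simulate one frame of slot-counter arbitration and classify every slot.

    Assumption: the frame has 2**q slots and every tag draws its slot
    counter uniformly from 0 .. 2**q - 1 when the Query is issued.
    """
    frame_size = 2 ** q
    counters = [rng.randrange(frame_size) for _ in range(num_tags)]
    outcome = {"empty": 0, "successful": 0, "collision": 0}
    for _ in range(frame_size):
        # Only tags whose slot counter is zero reply in the current slot.
        replies = sum(1 for counter in counters if counter == 0)
        if replies == 0:
            outcome["empty"] += 1
        elif replies == 1:
            outcome["successful"] += 1
        else:
            outcome["collision"] += 1
        # QueryRep: every tag decrements its slot counter by one. Tags that
        # already replied go negative and stay silent in this sketch.
        counters = [counter - 1 for counter in counters]
    return outcome
```

Calling `run_frame(20, 4, random.Random(42))` simulates 20 tags in a 16-slot frame. The three counts always add up to the frame-size, and the numbers of successful and collision slots are the inputs that the Backlog estimation techniques use to choose the next frame-size.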

EDFSA Method

The DFSA algorithms change the frame-size to increase the efficiency of tag identification. However, as the number of tags becomes larger than the frame-size, the probability of collision increases rapidly. If the number of unread tags can be estimated accurately, the frame-size can be chosen to maximise the system efficiency or minimise the tag collision probability. For instance, when the number of tags is large, the probability of tag collision can be reduced by increasing the frame-size. However, the frame-size cannot be increased indefinitely. When the number of unread tags is too large to achieve high system efficiency, the number of responding tags must be restricted so that an optimal number of tags responds for the given frame-size (Lee et al., 2005b; Lee and Lee, 2006).

The Enhanced Dynamic Framed-Slotted ALOHA (EDFSA) first estimates the number of unread tags. If the number of tags within the interrogation zone is larger than the maximum frame-size, the EDFSA algorithm splits the Backlog into a number of groups and allows only one group of tags to respond. When the reader limits the number of responding tags, it transmits the number of tag sets and a random number to the tags when it issues the query. Only the tag that picks zero as its slot counter responds to the request. If the estimated Backlog is below the threshold, the reader adjusts the frame-size without grouping the unread tags. After each read cycle, the reader estimates the number of unread tags and adjusts its frame-size. This procedure repeats until all the tags are read (Lee et al., 2005b; Lee and Lee, 2006).

Table 3.5 shows the derived rule for tag grouping in the EDFSA method. The table demonstrates that if the estimated number of unread tags is 354 or fewer, the EDFSA algorithm does not split the tags into groups. However, according to the rule, if more than 354 tags remain in the interrogation zone, the EDFSA algorithm splits the unread tags into groups. For instance, if there are 1245 estimated remaining tags, the EDFSA algorithm divides the tags into four groups (refer to the rule in Table 3.5). The problem with the EDFSA method is that it assumes that 256 is the optimal frame-size and splits tags into groups using powers of two (2, 4, 8, ...). This decreases system efficiency when the number of tags is just above a threshold and the number of groups doubles.

Table 3.5: EDFSA Rule - The number of unread tags, optimal frame-size, and number of groups (columns: The number of unread tags, Frame-Size, Group)

Other ALOHA-Based Methods

A number of methodologies have been proposed to improve the efficiency of ALOHA-based anti-collision methods. These include partitioning algorithms (Shin and Kim, 2007; Kim, 2008), which claim higher efficiency than the EDFSA approach but lack signaling robustness. Despite the wide array of approaches, only the BFSA, DFSA and EDFSA methods (Klair et al., 2010; Cheng and Jin, 2007) are commonly used for comparative analysis in the literature. Additionally, Backlog estimation approaches have also been a popular research topic in this domain.

Backlog Estimation Techniques

In order to predict the number of unread tags accurately and to determine the new frame-size for the next identification round, BFSA, DFSA, and EDFSA algorithms gather and use information such as the number of successful slots, empty slots, and collision slots from the previous round. Several other methods related to Backlog estimation have been mentioned in the literature, including the Schoute method (Schoute, 1983), the Lowerbound method, the Chen1 and Chen2 methods (Chen, 2006), the Vogt method (Vogt, 2002), and the Bayesian method (Floerkemeier, 2007). Some of these methods either perform worse than the simple Schoute and Lowerbound methods or are too complicated to implement in an RFID system. These methods are explained as follows:

Schoute Backlog Estimation Technique

Schoute (1983) developed a Backlog estimation technique for Dynamic Framed-Slotted ALOHA using the Poisson distribution. The Backlog after the current frame, $B_t$, is given by the equation:

$$B_t = 2.39\,c$$

where $c$ represents the number of collided slots in the current frame, and $B_t$ represents the remaining Backlog. This technique achieves the best performance, using the fewest frames compared with the other algorithms. The Schoute method is the simplest, is easy to implement with low computational overhead, and provides accurate tag estimation.
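As a minimal sketch of how the Schoute estimate can drive the frame-size, the following Python code runs framed-slotted ALOHA rounds until a frame contains no collision slot. It is an illustration under stated assumptions, not the procedure of Schoute (1983). The next frame-size is simply the rounded Backlog estimate, there is no maximum frame-size, and no capture effect or channel error is modelled. The names `schoute_backlog` and `identify_all` are hypothetical.

```python
import random


def schoute_backlog(collision_slots):
    """Schoute's estimate of the tags still unread after a frame: B_t = 2.39 * c."""
    return 2.39 * collision_slots


def identify_all(num_tags, initial_frame_size, rng, max_frames=10_000):
    """Run framed-slotted ALOHA rounds, sizing each new frame from the estimate.

    Returns (frames_used, total_slots_used).
    Assumption: the next frame-size equals the rounded Backlog estimate.
    """
    unread = num_tags  # hidden from the reader; only drives the simulation
    frame_size = initial_frame_size
    frames = 0
    total_slots = 0
    while frames < max_frames:
        slots = [0] * frame_size
        for _ in range(unread):
            slots[rng.randrange(frame_size)] += 1
        successful = sum(1 for replies in slots if replies == 1)
        collided = sum(1 for replies in slots if replies > 1)
        unread -= successful
        frames += 1
        total_slots += frame_size
        if collided == 0:
            # No collision slot means every remaining tag has been read.
            return frames, total_slots
        frame_size = max(1, round(schoute_backlog(collided)))
    raise RuntimeError("identification did not finish within max_frames")
```

A call such as `identify_all(50, 16, random.Random(1))` returns the number of frames and slots needed to read 50 tags from an initial frame of 16 slots. Replacing `schoute_backlog` with another estimator allows the same loop to be reused for a rough comparison of estimation techniques.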

Lowerbound Backlog Estimation Technique

The Lowerbound estimation function is obtained under the assumption that a collision involves at least two different tags. Therefore, the Backlog after the current frame, $B_t$, is defined by the equation:

$$B_t = 2\,c$$

where $c$ is the number of collided slots in the current frame, and $B_t$ represents the remaining Backlog. The Lowerbound method is also simple, is easy to implement with low computational overhead, and provides accurate tag estimation.

Chen1 and Chen2 Estimation Techniques

Most of the static algorithms estimate the Backlog from the number of collided slots. However, the Chen1 method (Chen, 2006) estimates the Backlog from the empty slot information, through the probability of finding $h$ empty slots after completing a frame. The Chen2 method (Chen, 2006) is a simpler way to estimate the number of tags, which is given by the following equation:

$$n = (L - 1)\,\frac{s}{h}$$

where $n$ is the Backlog, $L$ is the frame length, $s$ is the number of successful slots, and $h$ is the number of empty slots. If $h = 0$, $n$ is set to a certain upper bound for the tag estimate. According to Wang et al. (2007), the Chen1 and Chen2 methods perform worse than the simple Schoute method. The Chen1 method also requires complex computation, which leads to high overhead and delays the tag identification process.

Vogt Estimation Techniques

In Vogt (2002), a procedure to estimate the Backlog is presented that minimises the difference between the observed values, namely the number of empty slots $h$, successful slots $s$, and collision slots $c$, and the expected values $E(H)$, $E(S)$, $E(C)$. In order to find a comparatively precise Backlog, the reader needs to resolve the equation below:

$$\min_{N} \left\| \begin{pmatrix} h \\ s \\ c \end{pmatrix} - \begin{pmatrix} E_N(H) \\ E_N(S) \\ E_N(C) \end{pmatrix} \right\|$$

The Vogt method provides the most accurate tag estimation. However, the complexity of the algorithm results in high overhead, and it therefore cannot be applied to the EPC Gen2 protocol (Lee et al., 2008b).

Bayesian Estimation Techniques

The Bayesian method (Floerkemeier, 2007; Wu and Zeng, 2010) first computes the frame size $L$ based on the current probability distribution of the random variable $N$, which represents the number of tags transmitting. It then starts a frame with $L$ slots, waits for tag replies, and updates the probability distribution of $N$ based on the evidence from the reader at the end of the frame. The evidence comprises the number of empty slots, successful slots, and collision slots in the last frame. The method then adjusts the probability distribution of $N$ by considering newly arrived tags and departing tags, including the ones that successfully replied and did not transmit in subsequent slots. The Bayesian method requires the most complex computation and implementation, which results in high overhead and therefore delays the identification process.

3.5 Discussion

In this section, we analyse the shortcomings of existing data stream filtering and anti-collision methods. We also identify research issues and outline the research problems investigated in this thesis.

Limitations of Existing Methods

The approaches previously discussed can be applied to handle unreliable reads, noise, and duplication. However, many challenges remain for missed reads, which are the most crucial issue in RFID applications and the hardest to identify and filter. Filling in dropped readings is one way to compensate for missed reads, but it is easier to fix the error at the source, where the data goes missing in the first place. The cause of these missed reads is RF collision, which occurs when two or more tags attempt to respond to a reader at the same time. To solve the RF collision problem, several anti-collision protocols have been proposed in the literature. However, these approaches still suffer from performance inefficiency, long identification delays, and computational overhead.

75 3.5. DISCUSSION Limitation on Data Stream Filtering Techniques Literature surveys on unreliable reads demonstrate that, if physical equipment including tags, readers, and antennas are set up accordingly, the unreliable reads can be avoided in most cases. The environmental selection for the RFID hardware deployment is also very crucial to minimise the fault readings. Noise readings can be simply filtered by scheduling a specific sliding window, which expires over a certain time. If the count of a certain tag falls below the threshold, it is safe to assume that the tag is outside the normal reading zone and is accidentally captured by the reader. Duplication at reader level can be avoided using the algorithm that identified the unnecessary readers, and disabled them to minimise the number of reader within one reading zone. In addition, duplication at data level can be filtered using several techniques proposed in literatures. For instance, if the same set of tags are captured several times within a specific time-frame, it is safe to assume that these tags are redundant and must be removed. As for missed reads, which is the most critical issue in RFID applications, several techniques have been proposed to surrogate the missing data. However, it is preferred that the missed read does not occur from the beginning. The common cause of missed reads is the RF collision, which can be solved by applying anti-collision protocols to prevent two or more tags from communicating to a reader at the same time. The two types of tag anti-collision algorithms accepted and widely used in RFID systems are the tree-based anti-collision, and the ALOHA-based anti-collision techniques Limitation on Tree-based Anti-Collision Techniques There are several tree-based anti-collision techniques that can effectively prevent tag collisions. 
Most memory-based anti-collision algorithms, including Binary Search, Bit Arbitration, and Tree Splitting, have higher computational complexity than the memoryless Query Tree (QT). Nevertheless, some techniques from the QT category still have drawbacks and limitations, as described below:

Query Tree protocols suffer from a long identification delay when there are a large number of tags within an interrogation zone. The delay is also caused by similarity between tag IDs and by tag mobility, where tags do not stay at the same spot at all times.

Adaptive Query Splitting (AQS) introduces more complexity than QT because information from the last identification must be kept in order to accelerate the identification process. This technique also requires tags to support transmission and reception at the same time, making it difficult to apply to low-cost passive RFID systems.

Hybrid Query Tree (HQT) reduces collision cycles by querying 2 bits of prefix in each loop instead of 1 bit as in QT. However, it produces even more idle cycles than QT because each level generates four leaf nodes, especially if there are not many tags in the interrogation zone. A basic idle-cycle elimination scheme (a slotted back-off tag response mechanism) exists for HQT, but it requires more time and memory. The extended version of HQT also requires extra memory, since it mimics AQS in keeping information from the last identification. However, HQT is better than QT at reducing collisions between tags, especially when the number of tags is high.

Limitation on ALOHA-based Anti-Collision Techniques

ALOHA-based anti-collision is the most widely used type of anti-collision technique within the probabilistic category. Earlier types such as Pure ALOHA and Slotted ALOHA perform poorly, while the more advanced Framed-Slotted ALOHA performs better. However, some techniques from the Framed-Slotted ALOHA category still have drawbacks and limitations, as described below:

Basic Framed-Slotted ALOHA has the worst performance among the Framed-Slotted ALOHA methods. This approach suffers from an inaccurate frame-size in each round of identification because it uses a fixed frame-size. Therefore, the system's efficiency drops significantly when the tag count is too large or too small for the frame.

Dynamic Framed-Slotted ALOHA (DFSA) suffers from different levels of inefficiency, depending on the frame-size prediction technique applied. If the number of unread tags is not estimated accurately, the correct frame-size cannot be determined to maximise system efficiency or minimise the tag collision probability. Thus, the performance of DFSA depends highly on the choice of frame-size estimation technique.
Enhanced Dynamic Framed-Slotted ALOHA (EDFSA) assumes that the optimal frame-size is fixed at 256. The number of groups in EDFSA increases in powers of two (2, 4, 8, ...), which decreases system efficiency when the number of tags is just above a threshold and the number of groups doubles.

Current Backlog Estimation methods either perform poorly or are too complicated to implement in an RFID system. Schoute's method is the simplest, is easy to implement with low computational overhead, and provides accurate tag estimation. Other Backlog Estimation methods, such as the Chen1 and Chen2 methods, the Vogt method, and the Bayesian method, perform well in simulation but cannot realistically be applied to an actual passive RFID system. Specifically, the Bayesian method requires the most complex computation and implementation.

Overall, the literature review on current state-of-the-art techniques demonstrates that some of the existing techniques are inefficient, while others are too complex, with a high overhead cost of implementation. Some approaches cannot be improved further on their own, but we can take advantage of other constraints to improve their capability. For instance, basic tree-based methods such as QT (2-ary) and HQT (4-ary) are the best naive tree-based methods, but they cannot be improved any further, in terms of simplicity, without resorting to a complex algorithm. Thus, we need to take advantage of other constraints, such as the EPC pattern and the possible use of a combination of two trees, in order to improve memory and power efficiency. Additionally, for probabilistic anti-collision, the DFSA method is the simplest and most accurate method. However, in this case, to keep the DFSA algorithm simple, only the frame-size prediction scheme can be improved further.

Research Problem

It remains an open problem to find optimal solutions that improve the performance of current RFID anti-collision techniques. The two main goals for both tree-based deterministic and ALOHA-based probabilistic anti-collision methods are to achieve maximum efficiency and to minimise the identification time and the resources wasted during the identification process. Structuring anti-collision methods in an RFID system is extremely important because it is the step that determines the effectiveness and the overall quality of the data captured.
In this thesis, the research problem is to investigate a suitable structure for tree-based deterministic and ALOHA-based probabilistic anti-collision approaches, such that new efficient methods can be developed to improve the performance of anti-collision techniques in RFID systems. Given the limited resources of RFID components, including the readers and the tags, it is important to develop anti-collision methods that minimise power and memory usage in the RFID reader, and to simplify the structure of the algorithms so that identification time can be minimised.

There are two main constraints in developing effective anti-collision algorithms: the limited power source of the RFID reader and the limited memory in both readers and tags. Complex anti-collision algorithms need high memory capacity and power, which is impractical in an RFID system. Therefore, our aim is to develop anti-collision schemes that are simple, have low computational overhead, and perform effectively compared with existing techniques.

To address our research problem, we compare our newly proposed methods with specific existing approaches that have a simple algorithm structure, are highly robust, and achieve high performance with minimal time requirements. We also compare our proposed tree-based method and ALOHA-based method, and analyse the benefits and drawbacks of both methods under different circumstances. Additionally, we introduce two novel conceptual selective technique management approaches, which employ the correct type of anti-collision algorithm for a specific scenario.

3.6 Summary

In this chapter, we have investigated different types of data stream errors, including duplication, noise, unreliable reads, and missed reads. Particular attention was given to the filtering of missed reads, the most crucial type of data stream error, which is mainly caused by collision. We first described existing data stream filtering methods and anti-collision methods, and categorised them into different classes. We then identified and analysed the shortcomings and limitations of these methods. Furthermore, we identified interesting research issues concerning tag anti-collision for RFID data streams. These research problems are addressed in this thesis.

The next three chapters describe a detailed investigation of tree-based and ALOHA-based anti-collision, and of selective technique management.

4 Deterministic Anti-Collision Approaches

In this chapter, we tackle problems in existing deterministic tree-based anti-collision schemes, including the number of identification cycles produced and the total memory used during the identification process. We introduce two main methods derived from the fundamentals of tree-based anti-collision protocols: 1) a Unified Q-ary Tree (Pupunwiwat and Stantic, 2009a,b) and 2) a Joined Q-ary Tree (Pupunwiwat and Stantic, 2010c), with the intended goal of minimising the memory used for queries by the RFID reader.

As mentioned in the literature, most implementations of tree-based algorithms are deployed with the older EPC Class 1 tags, which have limited memory and capability. Although recent technology uses ALOHA-based rather than tree-based anti-collision algorithms, we decided to improve the tree-based approach with a simple implementation in order to maintain backward compatibility with older RFID systems.

The remainder of this chapter comprises an explanation of different types of EPC encoding schemes, a discussion of typical scenarios, the justification of Splitting Fitness, the foundations of the Unified Q-ary Tree and the Joined Q-ary Tree, and the experimental evaluations.

4.1 EPC Encoding Schemes Analysis

There are many types of encoding schemes compatible with passive RFID tags. The most common type is the General Identifier 96-bit scheme, which is independent of any existing identity specification or convention and can be used in most cases. There are also encodings designed specifically for special instances, such as the Serialised Global Trade Item Number (SGTIN) 96-bit scheme, which permits the direct embedding of GS1 System standard GTIN and serial number codes on EPC tags (EPCGlobal, 2008).

In this thesis, we observe three different encoding schemes and conduct experiments to find the impact of each. For the main part of our methodology, we use the General Identifier as our major encoding scheme, since it is the most general form of encoding. We select the other two encoding schemes based on their bit lengths and numbers of fields. The three chosen encoding schemes are explained in the following subsections.

General Identifier 96 Bits

The General Identifier (GID) is defined for a 96-bit EPC, and is independent of any existing identity specification or convention. In addition to the Header, which guarantees uniqueness of the encoding type, the General Identifier is composed of three fields: the General Manager Number (GMN), Object Class (OC), and Serial Number (SN), as shown in Table 4.1.

Table 4.1: The GID-96 includes three fields in addition to the Header, with a total of 96-bits binary value. Only H is shown in Binary, while the rest are shown in Decimal

GID-96 | Bit | Max. Decimal/Binary
Header (H) | 8 |
General Manager Number (GMN) | 28 | 268,435,455
Object Class (OC) | 24 | 16,777,215
Serial Number (SN) | 36 | 68,719,476,735

Table 4.1 shows an example of the GID-96 EPC Generation 2 encoding scheme. The general structure of EPC tag encodings is a string of bits, consisting of a fixed-length (8-bit) Header followed by a series of numeric fields whose overall length, structure, and function are completely determined by the Header value. There are four major fields in GID-96.

Serialised Global Trade Item Number 96 Bits

The EPC tag encoding scheme for the Serialised Global Trade Item Number (SGTIN) permits the direct embedding of GS1 System standard GTIN and serial number codes on EPC tags.

Table 4.2: The SGTIN-96 includes six fields with a total of 96-bits binary value. Only H is shown in Binary, while the rest are shown in Decimal

SGTIN-96 | Bit | Max. Decimal/Binary
Header (H) | 8 |
Filter Value (FV) | 3 | Various, depending on type: 000, 001, 010, 011, 100, 101, or 110
Partition (PT) | 3 | Various, depending on type; refer to Table 4.3
Company Prefix (CP) | 20-40 | 999,999 - 999,999,999,999
Item Reference (IR) | 24-4 | 9,999,999 - 9
Serial Number (SN) | 38 | 274,877,906,943

In addition to a Header, the SGTIN-96 is composed of five fields: the Filter Value (FV), Partition Value (PV), Company Prefix (CP), Item Reference (IR), and Serial Number (SN), as shown in Table 4.2.

Table 4.2 shows an example of the SGTIN-96 EPC Generation 2 encoding scheme. The general structure of EPC tag encodings is a string of bits that begins with a binary Header value. The FV is not part of the SGTIN pure identity but is additional data used for fast filtering of basic logistics types. The available values of PV and the corresponding sizes of the CP and IR fields are defined in Table 4.3. The CP contains a literal embedding of the GS1 company prefix, and the IR contains a literal embedding of the GTIN item reference number. Finally, the SN contains a unique serial number for each individual item tagged with a passive RFID tag.

Table 4.3: SGTIN-96 and GIAI-96 Partitions in bits
Columns: Partition (PT); Company Prefix (CP); Item Reference (IR) - SGTIN-96; Individual Asset Reference (IAR) - GIAI-96

Table 4.3 shows the SGTIN-96 and GIAI-96 (explained in the next subsection) partition values. The Partition indicates where the subsequent CP and IR, or CP and IAR, fields are divided.

Global Individual Asset Identifier 96 Bits

The EPC tag encoding scheme for the Global Individual Asset Identifier (GIAI) permits the direct embedding of GS1 System standard GIAI codes on EPC tags. In addition to a Header, the EPC GIAI-96 is composed of four fields: the FV, PV, CP, and Individual Asset Reference (IAR), as shown in Table 4.4.

Table 4.4: The GIAI-96 includes five fields with a total of 96-bits binary value. Only H is shown in Binary, while the rest are shown in Decimal

GIAI-96 | Bit | Max. Decimal/Binary
Header (H) | 8 |
Filter Value (FV) | 3 | Various, depending on type: 000, 001, 010, 011, 100, 101, or 110
Partition (PT) | 3 | Various, depending on type; refer to Table 4.3
Company Prefix (CP) | 20-40 | 999,999 - 999,999,999,999
Individual Asset Reference (IAR) | 62-42 | 4,611,686,018,427,387,903 - 4,398,046,511,103

Table 4.4 shows an example of the GIAI-96 EPC Generation 2 encoding scheme. The general structure of EPC tag encodings is a string of bits that begins with a binary Header value. The FV is not part of the GIAI pure identity but is additional data used for pre-selection of basic logistics types. The available values of PV and the corresponding sizes of the CP and IAR fields are defined in Table 4.3. The CP contains a literal embedding of the GS1 company prefix, and the IAR is a mandatory unique number for each individual instance.

The difference between SGTIN-96 and GIAI-96 is that GIAI-96 contains one field fewer than SGTIN-96, since it uses only the IAR as a unique identifier instead of the IR and SN. We therefore chose these two encoding schemes for our experimental evaluation, to see whether they have any impact on our proposed tree-based anti-collision methods. For the remainder of this chapter, all samples and methodology explanations use the GID-96 encoding scheme, while SGTIN-96 and GIAI-96 are used in our experiments.

4.2 Warehouse Distribution Scenarios

In this chapter, we examine specific scenarios based on the assumption that items tend to move and stay together through different locations, especially in a large warehouse. We focus on a Crystal warehouse scenario using the GID-96 encoding scheme, which can be classified into four different scenarios: 1) Unique Item-Level, 2) Unique Container-Level, 3) Unique Company-Level, and 4) Unique Warehouse-Level.

Figure 4.1: Crystal Warehouse Scenario: a) Unique Item-Level, b) Unique Container-Level, c) Unique Company-Level, and d) Unique Warehouse-Level

Unique Item-Level Scenario

This scenario occurs when two collided tags (GID-96 encoding) are captured and they have the same Encoding Scheme/Header (=), same GMN (=), same OC (=), but different SN (≠). We can assume that all items are from the same warehouse, that the warehouse uses the same encoding scheme throughout, and that it keeps different kinds of products from different companies.

Figure 4.1 illustrates a sample Crystal warehouse scenario where:

a) Unique Item-Level: Two containers of crystal red-wine have the same Header (=), GMN (=), and OC (=), but different SN (≠)
b) Unique Container-Level: Crystal white-wine and crystal red-wine containers have the same Header (=) and GMN (=), but different OC (≠) and SN (≠)
c) Unique Company-Level: Crystal white-wine and crystal plate containers have the same Header (=), but different GMN (≠), OC (≠), and SN (≠)
d) Unique Warehouse-Level: Crystal plate and plastic plate containers have different Header (≠), GMN (≠), OC (≠), and SN (≠)

For the Unique Item-Level circumstance, the Crystal warehouse example in Figure 4.1a) shows that two collided tags are captured with the same Encoding Scheme, General Manager Number, and Object Class. We believe that the two tags are attached to two different cases of red-wine.

Unique Container-Level Scenario

The Unique Container-Level Scenario takes place when two collided tags are captured and they have the same Header (=), same GMN (=), different OC (≠), and different SN (≠). Figure 4.1b) shows that crystal red-wine glasses and crystal white-wine glasses are packed in different cases and pallets because they are different types of wine glasses. Within this scenario, each case of wine glasses has a unique SN attached to it, with a different OC for each pallet of white-wine or red-wine.

Unique Company-Level Scenario

The Unique Company-Level Scenario is illustrated in Figure 4.1c). Two collided tags are captured and they have the same Header (=), but unique GMN (≠), OC (≠), and SN (≠). We believe that one tag is attached to a crystal plate case, while the other tag is attached to a white-wine case. We can assume that two different companies produce separate crystal ware: the wine glasses and plates come from different companies but share the same warehouse because they are both crystal.
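The field comparisons that define these scenarios can be sketched directly on the GID-96 layout of Table 4.1 (8-bit Header, 28-bit GMN, 24-bit OC, 36-bit SN). The sketch below is illustrative only: the names `Gid96`, `parse_gid96`, `make_gid96`, and `scenario` are hypothetical, the classification by first differing field is an assumption (the text defines each scenario by the full pattern of equal and different fields), and the Header value `0x35` in the usage note is just an example 8-bit value.

```python
from typing import NamedTuple


class Gid96(NamedTuple):
    header: int  # 8 bits
    gmn: int     # General Manager Number, 28 bits
    oc: int      # Object Class, 24 bits
    sn: int      # Serial Number, 36 bits


def make_gid96(header: int, gmn: int, oc: int, sn: int) -> int:
    """Pack the four fields into one 96-bit integer (no range checking)."""
    return (header << 88) | (gmn << 60) | (oc << 36) | sn


def parse_gid96(epc: int) -> Gid96:
    """Split a 96-bit integer into Header, GMN, OC, and SN."""
    if not 0 <= epc < (1 << 96):
        raise ValueError("EPC must fit in 96 bits")
    return Gid96(
        header=epc >> 88,
        gmn=(epc >> 60) & ((1 << 28) - 1),
        oc=(epc >> 36) & ((1 << 24) - 1),
        sn=epc & ((1 << 36) - 1),
    )


def scenario(epc_a: int, epc_b: int) -> str:
    """Name the scenario by the first field, from the left, in which two EPCs differ."""
    a, b = parse_gid96(epc_a), parse_gid96(epc_b)
    if a.header != b.header:
        return "Unique Warehouse-Level"
    if a.gmn != b.gmn:
        return "Unique Company-Level"
    if a.oc != b.oc:
        return "Unique Container-Level"
    if a.sn != b.sn:
        return "Unique Item-Level"
    return "Identical"
```

For example, `scenario(make_gid96(0x35, 7, 1, 10), make_gid96(0x35, 7, 1, 11))` yields `"Unique Item-Level"`, since only the Serial Number differs. The leftmost differing field matters for a tree-based protocol because it determines how many leading bits two tags share, and therefore how deep the tree must go before they separate.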

Unique Warehouse-Level Scenario

The Unique Warehouse-Level Scenario occurs when two collided tags are captured and they have different Header (≠), GMN (≠), OC (≠), and SN (≠). We can assume that the items are from different companies that use different encoding schemes. For example, Figure 4.1d) shows two wine glasses with different sculpture, one made from crystal and the other from plastic, allocated in the same warehouse. This scenario is not discussed any further in this chapter because we are only looking at large warehouse distribution, where most items move together as a group. Items from the same type of manufacturing therefore stay together until they are distributed to smaller retailers.

4.3 Splitting Fitness

Splitting Fitness is the measure of performance for our proposed tree-based anti-collision methods. It can be classified into Worst-Case splitting, Perfect splitting, and Random splitting. All three cases are discussed in the following subsections.

Figure 4.2: Splitting Fitness: a) Worst-Case Splitting, b) Perfect Splitting, and c) Random Splitting

Worst-Case Splitting

Worst-Case splitting occurs when tags split into an unbalanced tree in which, in the binary case, one child node receives no tags. Figure 4.2a) shows 16 tags at Level 0 of the tree; at Level 1, the tags split into 16 tags on the left-hand node and no tags on the right-hand node. As the right-hand node is empty, no further splitting is necessary there. This kind of splitting is likely to happen for the first few bits of EPC identification in a real-world warehouse environment, because most items move in bulk and usually belong to the same EPC pattern with similar IDs. Worst-Case splitting causes more Idle cycles because all tags travel down only one side of the tree, which also results in further collisions.

Perfect Splitting

Perfect splitting happens when a set of tags splits equally between the left and right child nodes. Figure 4.2b) shows 16 tags at Level 0; at Level 1, the tags split equally, with 8 tags going to each child node. Further splitting is required for both the left-hand and right-hand nodes until only one tag is left in each node. This kind of splitting is almost impossible in a real-world scenario, but it is the closest case to the later stages (bits) of EPC identification within a warehouse environment, because most items belong to the same group of EPC pattern. For example, suppose one pallet of white-wine glasses containing 20 cases moves into an interrogation zone. All items from the pallet have the same OC and travel along the same side of the tree at the earlier levels, resulting in Worst-Case splitting. However, the remaining few bits are unique for each EPC because they belong to the SN. These remaining bits, encoded within the same EPC pattern, split almost equally between the left and right child nodes. The two sides of the binary tree will not be exactly equal, since the data captured are not always even. Therefore, we call this situation Partial-Perfect splitting.

Figure 4.3: A sample of 16 tags from the same pallet with the same Object Class and 16 unique Serial Numbers

Figure 4.3 shows an example of the tag splitting process in a warehouse environment. There are 16 tags at the start; from Level 1 to Level 3, all tags move down the left-hand child node and no tag moves down the right-hand node. We can assume that all of these tags have the same Header, GMN, and OC. However, from Level 4 onward, tags start splitting into the left and right child nodes of the binary tree, which means that the query has reached the bits that are unique to each tag, namely the SN, so the EPC IDs are similar but no longer identical. At Level 7, all 16 tags are successfully identified. Note that this figure only shows an ideal sample of the identification process. In real-world scenarios, tags will most likely follow Partial-Perfect splitting from Level 4 onward.

Random Splitting

Random splitting happens when a set of tags splits between the left and right child nodes randomly, so that no splitting pattern can be found. Figure 4.2c) shows 16 tags at Level 0; at Level 1, the tags split into groups of 5 and 11 tags. At Level 2, the tags split further into groups of 2, 3, 4, and 7, with no specific splitting pattern. This situation is called Random splitting. It is likely to happen in retail distribution environments (which belong to the Unique Warehouse-Level Scenario) because the items usually come from different locations. This splitting case is not discussed further, as this chapter focuses only on the warehouse environment.

4.4 Unified Q-ary Tree

Instead of using a plain Q-ary tree, which uses every 1 (2-ary), 2 (4-ary), 3 (8-ary), or 4 (16-ary) bits of the tag ID to split a tag set, we propose a Unified Q-ary Tree, a combination of two Q-ary trees (12 possible combinations), which can reduce collisions further while minimising memory usage. For example, we can combine a 4-ary tree with an 8-ary tree and apply this anti-collision scheme to a 96-bit EPC. The challenge is to configure the right partition, so that the 4-ary tree is applied to the first part of the EPC bits and the 8-ary tree is applied to the remaining bits.
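The idea of switching branching width at a partition point can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not the thesis implementation: the function name `unified_qary` and its parameters are hypothetical, each query is charged as many bits as the length of its prefix, the first queries are the children of the empty prefix, and the extension width is clipped at the partition point and at the end of the ID so that prefixes never overshoot either boundary.

```python
from collections import Counter, deque


def unified_qary(tags, k_first, k_second, split):
    """Simulate a two-width query tree over distinct, equal-length bit strings.

    k_first bits are appended per collision while the prefix is shorter than
    `split`; k_second bits are appended afterwards. Setting k_first == k_second
    gives a plain (Naive) Q-ary tree. Returns cycle counts and total query bits.
    """
    tags = list(tags)
    if not tags or k_first < 1 or k_second < 1:
        raise ValueError("need at least one tag and widths of at least 1 bit")
    n = len(tags[0])
    if any(len(t) != n for t in tags) or len(set(tags)) != len(tags):
        raise ValueError("tags must be distinct bit strings of equal length")
    if not 0 <= split <= n:
        raise ValueError("split must lie between 0 and the tag length")

    def children(prefix):
        depth = len(prefix)
        if depth < split:
            k = min(k_first, split - depth)   # clip at the partition point
        else:
            k = min(k_second, n - depth)      # clip at the end of the ID
        return [prefix + format(i, f"0{k}b") for i in range(1 << k)]

    stats = Counter(collision=0, idle=0, success=0, bits=0)
    queue = deque(children(""))
    while queue:
        prefix = queue.popleft()
        stats["bits"] += len(prefix)          # assumed cost model: prefix length
        matches = sum(t.startswith(prefix) for t in tags)
        if matches == 0:
            stats["idle"] += 1
        elif matches == 1:
            stats["success"] += 1
        else:
            stats["collision"] += 1
            queue.extend(children(prefix))
    return dict(stats)
```

As a toy input, take four tags that share the 8-bit prefix `00000000` and differ only in their last 2 bits. Under the assumptions above, a plain 2-ary tree (`k_first=1, k_second=1`) spends 130 bits, a plain 4-ary tree (`2, 2`) spends 120 bits, the combination 2-ary then 4-ary with `split=8` spends 112 bits, and 4-ary then 2-ary with `split=8` spends 138 bits. Narrow splits are cheaper over the shared bits, and wide splits are cheaper over the distinguishing bits.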
The remainder of this section focuses on two approaches: 1) a Naive approach, where the Q-ary tree is non-unified and only a single Q-ary tree is used for anti-collision; and 2) a Unified approach, where two Q-ary trees are combined for anti-collision, with 12 possible combinations.

Unified Q-ary Tree Fundamental

Our proposed Unified Q-ary Tree combines the 2-ary, 4-ary, 8-ary, and 16-ary trees, with 12 possible combinations. The approach is applied to the EPC of each collided tag: the tag set is split using every 1, 2, 3, or 4 bits of the tag ID for the first few queries and then, from a chosen point onward, using a different width of 1, 2, 3, or 4 bits per query. Since most items in a warehouse move in bulk, the first few bits of their EPCs will be identical. In the GID-96 encoding scheme, the first 8 bits of the EPC are the Header, which is the same for all items using the same encoding; such items usually come from the same company and sit on the same pallet. These 8 bits can be bypassed faster using a 4-ary tree instead of a 2-ary tree, but doing so produces too many Idle cycles. With a 4-ary tree, the Number of bits needed for each query also accumulates faster than with a 2-ary tree. Thus, we need to optimise the performance of the Unified Q-ary Tree by configuring the right separating point between the two Q-ary trees. The objective of the Unified Q-ary Tree is to keep the number of cycles minimal and to minimise the Number of bits used for querying all tags within an interrogation zone, in order to improve the overall identification time. Figure 4.4 shows an example of the Naive 4-ary tree (4.4a) and the Unified 4-ary & 8-ary tree (4.4b).

Figure 4.4: A sample of: a) Naive 4-ary Tree, and b) Unified 4-ary & 8-ary Tree

Table 4.5: The Unified Q-ary Tree can be merged into twelve different combinations. 1, 2, 3, and 4 represent the Number of bits queried each time for splitting tags when a collision occurs
Columns: 2-ary (F, S); 4-ary (F, S); 8-ary (F, S); 16-ary (F, S)
Rows: 2-ary; 4-ary; 8-ary; 16-ary

For the computation of the Number of cycles and the Number of bits, let F be the first part of the EPC, where bits are identical, and let S be the second part of the EPC, where bits are unique. Table 4.5 shows the possible combinations of the four Q-ary trees: 2-ary, 4-ary, 8-ary, and 16-ary.

Computation of Naive approach and Unified approach

To observe the difference between the Naive approach and the Unified approach, we work through the computation for both, using two Naive and two Unified Q-ary Trees as examples. The Naive 2-ary tree, Naive 4-ary tree, Unified 2-ary & 4-ary tree, and Unified 4-ary & 2-ary tree are selected for the sample case.
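The cost of bypassing the identical bits, mentioned above for the 8-bit Header, can be estimated in closed form before looking at the worked sample. The sketch below is a back-of-the-envelope model, not a result from the experiments: the function name `identical_prefix_cost` is hypothetical, at least two tags are assumed to share the whole prefix, each query is assumed to cost as many bits as its prefix length, and the prefix length is assumed to be a multiple of the bits queried per split.

```python
def identical_prefix_cost(f_bits, k):
    """Cycles and query bits spent descending through f_bits identical bits.

    With k bits per split, every level issues 2**k queries: one collides
    (all tags share the prefix) and the remaining 2**k - 1 are idle.
    Returns (collision_cycles, idle_cycles, total_query_bits).
    """
    if k < 1 or f_bits < 0 or f_bits % k:
        raise ValueError("f_bits must be a non-negative multiple of k")
    levels = f_bits // k
    fanout = 1 << k
    collisions = levels
    idle = levels * (fanout - 1)
    # Every query at level d carries a prefix of d * k bits.
    bits = sum(fanout * d * k for d in range(1, levels + 1))
    return collisions, idle, bits
```

For an 8-bit identical prefix, this model gives 8 collision cycles, 8 idle cycles, and 72 bits for a 2-ary tree (`k=1`), against 4 collision cycles, 12 idle cycles, and 80 bits for a 4-ary tree (`k=2`). Both trees spend 16 cycles on the prefix, but the 4-ary tree spends more bits, which is why cycle counts alone do not reveal the difference.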

Figure 4.5 shows a comparison between the Unified approach (2-ary & 4-ary, 4-ary & 2-ary) and the Naive approach (4-ary, 2-ary) on five EPC data items. The Naive 4-ary tree has the fewest levels; however, Table 4.7 shows that it does not have the lowest Total number of bits. This shows that the number of tree levels affects the Total number of bits and the Overall cycles, but that a shallower tree does not necessarily perform best.

Figure 4.5: Identification processes of: a) Naive 2-ary Tree, b) Naive 4-ary Tree, c) Unified 2-ary & 4-ary Tree, and d) Unified 4-ary & 2-ary Tree

In order to calculate the Total number of bits required for the whole identification process, we need the Number of Child Nodes (NCN) at each level of the tree and the Number of Bits per Query (NBQ) at that level. The Number of Bits per Level (NBL) can be calculated as follows:

NBL = NCN × NBQ (4.1)

Table 4.6: Calculation of Total memory bits required for two Naive and two Unified Q-ary Trees. TNBL shows the Total Number of Bits required for the specific Q-ary Tree
Columns: Level; 2; 2, 4; 4, 2; 4
Final row: TNBL

After calculating the NBL for each level of the tree, the Total Number of Bits (TNBL) required is the sum of all NBL values. For example, Figure 4.5a) shows a tree with twelve levels in which every level except Level 10 has two child nodes; Level 10 has four. NBQ increases by 1 bit per level, since this is a Naive 2-ary tree. Thus, the NBL values (NCN × NBQ) for Levels 1 to 12 are: 2 (2×1), 4 (2×2), 6 (2×3), 8 (2×4), 10 (2×5), 12 (2×6), 14 (2×7), 16 (2×8), 18 (2×9), 40 (4×10), 22 (2×11), and 24 (2×12), respectively. Adding all NBL values together gives a TNBL of 176 bits, as shown in Table 4.6.

Table 4.7: Sample Outcomes for 5 tags identification using Naive and Unified approaches
Columns: Combination; 2; 2, 4; 4, 2; 4
Rows: Collision Cycles (F); Collision Cycles (S); Total Collision Cycles; Idle Cycles (F); Idle Cycles (S); Total Idle Cycles; Successful Cycles (F); Successful Cycles (S); Total Successful Cycles; Overall Cycles; Number of bits (F); Number of bits (S); Total number of bits

Table 4.7 shows that the Naive 2-ary tree and the Unified 4-ary & 2-ary tree have the same number of Overall cycles. However, the Total number of bits for the two approaches is different. The same holds for the Naive 4-ary tree and the Unified 2-ary & 4-ary tree: the Overall cycles are the same, but the Total number of bits differs.

As for the impact of the EPC data, when the EPC IDs are identical (bits 1-8), the 2-ary tree works better, since it uses fewer bits than the 4-ary tree. This difference cannot be seen without calculating the Total number of bits because, for F, the 2-ary and 4-ary trees have the same number of Collision cycles and Idle cycles. However, each of these cycles uses a different Number of bits for querying, so the 4-ary tree uses more bits than the 2-ary tree. For S, the 4-ary tree uses fewer bits than the 2-ary tree, since more Collision cycles occur in the 2-ary tree. Although the 4-ary tree produces more Idle cycles than the 2-ary tree in S, it still produces fewer Collision and Idle cycles in total. We can therefore assume that, for the identical bits of the EPC, a lower-order tree (2-ary) performs better than a higher-order tree (4-ary), while for the unique bits of the EPC, a higher-order tree is more suitable.

Experimental Evaluation

In order to show the significance of our proposed Unified Q-ary Tree methods, we conducted three experimental evaluations and compared our methods with existing techniques. There are three major data sets in the experiments. We performed ten runs on each test case and present the average results.

Environment

To study the proposed Unified Q-ary Tree method, all experiments are performed according to a Crystal warehouse scenario. The experiments are assumed to be set up in a well-controlled environment where there is no metal or water nearby. We randomly generate all data sets under the assumption that a UHF RFID reader is used and is mounted on a dock door at the end of a conveyor belt. Passive RFID tags are attached to each case of crystal ware.
Each pallet of wine glasses, plates, and bowls is moved along this conveyor belt. At this stage, we assume that all pallets move into and out of the interrogation zone at the same time, and that no tags arrive or leave during an identification round.

Experiment One Data sets

For the first experiment, the impact of the number of tags in an interrogation zone is examined. The aim of this experiment is to find the best and the worst performance of the Q-ary tree for a specific set of tags; therefore, only the four Naive Q-ary trees are examined. There are four test cases used in this experiment:

Test case A: 2 pallets, 24 cases each, total 48 tags
Test case B: 4 pallets, 24 cases each, total 96 tags
Test case C: 6 pallets, 24 cases each, total 144 tags
Test case D: 8 pallets, 24 cases each, total 192 tags

Figure 4.6: Level-Packaging: a) a case with 6 glasses, and b) a pallet with 27 cases

Experiment Two Data sets

In experiment two, the impact of the Separating Point for a specific set of tags is examined. The aim of this experiment is to find the best and the worst performance of the Naive and Unified Q-ary Trees under different Separating Points. Thus, the number of tags in the interrogation zone is fixed at 192 tags (8 pallets of 24 cases each). There are three test cases used in this experiment:

Test case A: F = 36 bits and S = 60 bits
Test case B: F = 60 bits and S = 36 bits
Test case C: F = 88 bits and S = 8 bits

Experiment Three Data sets

After identifying the best Q-ary trees from the first two experiments, the aim of the third experiment is to compare the performance of the best Unified Q-ary Tree with that of the Naive Q-ary Tree. The results presented in this experiment relate to the Unique Item-Level scenario described in Section 4.2. The data set contains 81 tags (EPCs). Each tag contains 60 identical bits for F and 36 unique bits for S. Each pallet contains 27 tags (see Figure 4.6), and three pallets are assumed to be visible to the reader each time. We applied two Naive approaches, the 2-ary and 4-ary trees, to the data set with no partition. We applied the Unified approaches to the data set using x = 60 and y = 36, based on the nature of the Unique Item-Level scenario, where the first 60 bits are identical.
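A data set with the shape described for Experiment Three (81 tags, 60 identical leading bits, 36 unique trailing bits) can be generated with a few lines of code. This is only a sketch of one plausible generator, since the text does not describe how the random data sets were produced: the function name `make_item_level_tags`, the fixed seed, and the use of uniformly random distinct serial numbers are all assumptions.

```python
import random


def make_item_level_tags(count=81, identical_bits=60, unique_bits=36, seed=0):
    """Return `count` distinct bit strings sharing their first `identical_bits` bits."""
    if identical_bits < 1 or unique_bits < 1:
        raise ValueError("both parts must be at least 1 bit wide")
    if count > (1 << unique_bits):
        raise ValueError("not enough distinct serial numbers for this many tags")
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    # Shared part F: same Header, GMN, and OC for every tag in the set.
    prefix = format(rng.getrandbits(identical_bits), f"0{identical_bits}b")
    # Unique part S: distinct serial numbers drawn without replacement.
    serials = rng.sample(range(1 << unique_bits), count)
    return [prefix + format(s, f"0{unique_bits}b") for s in serials]
```

Changing `identical_bits` and `unique_bits` to 36 and 60, or to 88 and 8, gives inputs shaped like the Separating Point test cases of Experiment Two, although real serial numbers on a pallet may be sequential rather than uniformly random.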

CHAPTER 4. DETERMINISTIC ANTI-COLLISION APPROACHES

Figure 4.6 displays a level-packaging, where each case contains 6 glasses and each pallet contains 27 cases. Cases in the same pallet share the same Object Class, and each case has a different Serial Number. For our experiment, three of these pallets will be visible to the reader attached to the dock's door next to the conveyor belt.

Results

This section presents the results and performance measurement of the Unified Q-ary Tree. These results are displayed as follows:

Experiment One Results

Based on the experiment simulation, Figure 4.7 shows the results of four Naive Q-ary Trees (2-ary, 4-ary, 8-ary, and 16-ary) on four data sets, each with a specific number of tags.

Figure 4.7: Performances of Naive Q-ary Trees on different set of tags

From Figure 4.7, we can see that the Naive 4-ary tree requires the least Total memory bits, while the Naive 16-ary tree requires the most, regardless of the number of tags within the interrogation zone. The Total memory bits increase as the number of tags increases, but the best performing tree remains the 4-ary tree. It can now be concluded that the best performing tree out of the four Q-ary trees is the 4-ary tree, and that the number of tags within the interrogation zone has no impact on which tree performs best.
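The bit counts compared in this experiment can be illustrated with a small simulation. The following is a minimal sketch, not the simulator used in this thesis: it assumes that each query costs as many bits as the prefix it carries, that tag IDs are unique bit strings of equal length, and that the reader starts by querying the Q shortest prefixes. The function name `naive_q_ary` and the four-bit sample tags are hypothetical.

```python
from collections import deque
from math import log2


def naive_q_ary(tags, q):
    """Simulate a naive Q-ary query tree and count cycles and queried bits.

    Assumptions (not taken from the thesis): tags are unique, equal-length
    bit strings, and a query costs len(prefix) bits.
    """
    step = int(log2(q))  # bits appended per tree level
    if 2 ** step != q:
        raise ValueError("q must be a power of two")
    if len(set(tags)) != len(tags) or any(len(t) % step for t in tags):
        raise ValueError("tags must be unique and a multiple of log2(q) bits long")

    branches = [format(i, f"0{step}b") for i in range(q)]
    stats = {"idle": 0, "collision": 0, "success": 0, "bits": 0}
    queue = deque(branches)  # first round: the q shortest prefixes
    while queue:
        prefix = queue.popleft()
        stats["bits"] += len(prefix)  # the reader transmits the prefix
        matches = sum(t.startswith(prefix) for t in tags)
        if matches == 0:
            stats["idle"] += 1
        elif matches == 1:
            stats["success"] += 1
        else:
            stats["collision"] += 1
            queue.extend(prefix + b for b in branches)  # split the collided node
    return stats


sample = ["0000", "0001", "0110", "1111"]  # hypothetical tag IDs
for q in (2, 4):
    print(q, naive_q_ary(sample, q))
```

Which tree needs fewer bits depends on the tag set: the toy tags above are not representative of the EPC data used in the experiments.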

Experiment Two Results

Table 4.8 shows the Total memory bits required for the Naive and Unified Q-ary trees. According to Table 4.8 and Figure 4.8, when the Separating Point is between bit 36 and 37 (F = 36, S = 60), the Unified 2-ary & 4-ary tree requires the least Total memory bits, while the Naive 16-ary tree requires the most. This supports the result from experiment one, where the Naive 16-ary tree has the worst performance, and the Naive 4-ary and Naive 2-ary trees perform best and second best respectively.

Table 4.8: Total Memory Bits required for each Q-ary Tree for 192 tags set identification. Columns: F=36, S=60; F=60, S=36; F=88, S=8. Rows: Naive 2-ary, Naive 4-ary, Naive 8-ary, Naive 16-ary, followed by the twelve Unified combinations of 2-ary, 4-ary, 8-ary, and 16-ary trees.

When the 2-ary and 4-ary trees are combined (Figure 4.8 Case e), the Total memory bits required are slightly lower than for the Naive 4-ary tree and much lower than for the Naive 2-ary tree. However, when the 4-ary tree is applied to F and the 2-ary tree to S (Figure 4.8 Case h), the Total memory bits are higher than for both the Naive 2-ary and the Naive 4-ary tree. This happens because F mostly engages Worst-Case Splitting (see Section 4.3), where a higher level tree produces more cycles in each level than a lower level tree. Thus, when the 4-ary tree is applied to F, four nodes are produced instead of two as in the 2-ary tree; three of the four nodes are Idle cycles, which waste resources and increase the Number of bits required. In contrast, applying the 4-ary tree to S requires a smaller Number of bits than the 2-ary tree, which results in fewer Total memory bits when the 2-ary and 4-ary trees are combined. The results of other Unified Q-ary Trees also show improvement over one of their Naive methods.
For example, in Figure 4.8 Case k, the 8-ary & 2-ary tree (82,332) outperformed the Naive 8-ary tree (87,636) but did not outperform the Naive 2-ary tree (81,792). The same outcome also applies to the other trees (Cases f, g, i-p), which demonstrates that each of our proposed Unified Q-ary Trees improves on at least one of the existing Naive anti-collision approaches it combines.


Figure 4.8: Performances of sixteen combinations of Q-ary Trees (4 Naive and 12 Unified) where F = 36 and S = 60

Table 4.8 also shows the performances of both Naive Q-ary Trees and Unified Q-ary Trees where F = 60 and S = 36. The Naive 4-ary tree requires the least Total memory bits out of the sixteen approaches. Neither the Unified 2-ary & 4-ary nor the Unified 4-ary & 2-ary tree outperformed the Naive 4-ary tree, because the Separating Point differs between Test case A and Test case B. In the test case applied, all data have the same Header and GMN, which make up only the first 36 bits of the EPC. The OC is different for each pallet, so eight pallets require eight different OCs (Bit 37 to Bit 60). The Separating Point of these results is between bit 60 and 61, which is already past the OC. Therefore, a 2-ary tree requires a higher Number of bits than a 4-ary tree for F, resulting in higher Total memory bits when the two trees are combined.

The performances of both Naive Q-ary Trees and Unified Q-ary Trees where F = 88 and S = 8 are also displayed in Table 4.8. The result shows that the Unified 2-ary & 4-ary tree requires the least Total memory bits, while the Unified 16-ary & 2-ary tree requires the most. This is because, after the tags are split into eight different pallets, the remaining bits of the EPC become identical again until the last few bits. Therefore, a 2-ary tree requires a lower Number of bits than a 4-ary tree for F, resulting in fewer Total memory bits for the Unified 2-ary & 4-ary tree.

From the experimental evaluation, we can now summarise that the Unified 2-ary & 4-ary Tree performed the best overall by reducing the Total memory bits required, where it

outperformed both of its Naive methods, which means that identification time can be minimised.

Experiment Three Results

Figure 4.9 shows the average results, from ten runs, of all four combinations: Naive 2-ary, Unified 2-ary & 4-ary, Unified 4-ary & 2-ary, and Naive 4-ary tree.

Figure 4.9: Results of two Naive approaches (2-ary, 4-ary) and two Unified approaches (2-ary & 4-ary, 4-ary & 2-ary) for number of Idle cycles, Collision cycles, Successful cycles, and Overall cycles

From the figure, we can see that the Naive 4-ary tree produced the most Idle cycles while the Naive 2-ary tree produced the least. In contrast, the Naive 2-ary tree produced the most Collision cycles while the Naive 4-ary tree produced the least. The Naive 2-ary tree and the Unified 4-ary & 2-ary tree have the same total number of cycles, which corroborates our methodology. In addition, the total numbers of cycles for the Naive 4-ary tree and the Unified 2-ary & 4-ary tree are also equal. The total number of cycles can, to some extent, indicate the performance of all four methods. We notice that both the Naive 4-ary tree and the Unified 2-ary & 4-ary tree have fewer total cycles than the 2-ary and the 4-ary & 2-ary trees. This means that the first two methods will use a lower Number of bits in querying for all 81 tags than the other two. However, without looking into the actual results for the Number of bits, we still cannot conclude which of the two methods will achieve the least identification time for querying. Based on Figure 4.9, the Successful cycles of all four methods are equal to 81, which means that all tags in the interrogation zone are 100 percent identified.

96 CHAPTER 4. DETERMINISTIC ANTI-COLLISION APPROACHES Figure 4.10: Results of two Naive approaches (2-ary, 4-ary) and two Unified approaches (2-ary & 4-ary, 4-ary & 2-ary) for Number of bits queried for Idle cycles, Collision cycles, Successful cycles, and Overall cycles We can also see that all 81 tags were recognised at the later stages, where all bits (bit no ) are unique. As for Identical bits of Idle cycles and Collision cycles, the sum of Idle cycles and Collision cycles have an outcome of 120 cycles, which means that both methods of 2-ary or 4-ary tree have no impact in the sense of cycles count. However, as mentioned earlier, we need to calculate the actual Number of bits in order to clarify the difference of the performance of both methods. The next Figure (Figure 4.10) shows the Number of bits for Idle cycles, Collision cycles, Successful cycles, and Overall cycles, of each method. Figure 4.10 shows all the actual bits for all queries that occur during tags identification. It can be seen that the Unified 2-ary & 4-ary tree has the lowest Number of bits queried for entire identification process. This verify our theory that by using a lower level tree for Identical bits of EPC and higher level tree for Unique bits of EPC, Number of bits queried can be minimised and identification process can be accelerated. There is not much difference in results but we can assume that as the number of tags in an interrogation zone increases, and when our proposed Joined Q-ary Tree (Section 4.5) is applied instead of the Unified Q-ary Tree, we will be able to see more differences in the outcome. For Identical bits of EPC, there is a slight difference between the Number of bits queried by the four methods. While Figure 4.9 shows that there is no difference between total number of cycles for Identical bits for all four methods, we can see clearly that the Total 70

97 4.4. UNIFIED Q-ARY TREE Figure 4.11: Performance Analysis of 2-ary Tree vs. 4-ary Tree on Unique bits of EPC, Bit 61-68, until all tags are identified. Results of Overall cycles are displayed number of bits is different for each case in Figure This is because each query inquired each time issues different Number of bits. For example, 4-ary tree issues two extra bits from the last query (from the parent node), while 2-ary tree only append one extra bit to the last query. The Unified 2-ary & 4-ary tree performed the best overall and required 60 bits less than the Naive 4-ary tree, and 924 bits less than the Naive 2-ary tree. In contrast, the Unified 4-ary & 2-ary tree performed the worst out of all four methods. This is because a higher level tree was used at the earlier stages of identification where all bits are identical. This means that more than 75 percent of the queries were Idle cycles which are waste of resources (See Figure ary & 2-ary; Idle cycles:collision cycles = Ratio of 3:1 or 75%:25%). By using 2-ary tree instead of 4-ary tree for Identical bits, 60 bits of queries were reduced (3720 minus 3660). For Unique bits of EPC, Number of bits query rises rapidly compared with Identical bits. Figure 4.10 shows that, by using 4-ary tree for Unique bits of EPC, Number of bits were slightly reduced (see Total bits queried for unique bits). The performance of each method on Unique bits of EPC will be specified in detail in Figure 4.11 and Table 4.9. Figure 4.11 and Table 4.9 show the number of Idle cycles, Collision cycles, Successful cycles and Overall cycles produced in each query round. We can see that at bit to bit 65-66, the difference between Overall cycles of 2-ary and 4-ary tree grows. After bit 67-68, there is not much difference between the two methods. From bit to bit 79-80, there are no Successful cycles for both methods; thus, there are no differences for their 71

Overall cycles.

Table 4.9: Performance Analysis of 2-ary Tree vs. 4-ary Tree on Unique bits of EPC, Bit 61-68, until all tags are identified. Columns: 2-ary and 4-ary for each bit pair from Bit 61, 62 to Bit 81, 82. Rows: Idle Cycles, Collision Cycles, Successful Cycles, Overall Cycles.

We can now assume that at bit to bit 71-72, the EPCs are similar but not identical, which results in the unstable change in the number of Overall cycles. On the other hand, at bit to bit 79-80, we can assume that all bits become identical again, resulting in no change in Overall cycles. The number of collided tags at bit to bit is exactly two, since the ratio of Idle cycles to Collision cycles is 1:1 for the 2-ary tree and 3:1 for the 4-ary tree respectively. Finally, all tags were identified at bit 81-82, resulting in the same number of Overall cycles for both the 2-ary tree and the 4-ary tree.

From the experiments, we can now conclude that by using a lower level 2-ary tree for Identical bits of EPC, and by using a higher level 4-ary tree for Unique bits of EPC, the Total number of bits for querying can be decreased. By reducing the Total number of bits, identification time for each round can be minimised.

4.5 Joined Q-ary Tree

The Joined Q-ary Tree employs the right combination of Q-ary trees for each specific scenario. The Joined Q-ary Tree adaptively adjusts its tree branches to suit the EPC pattern, rather than splitting only once as in the Unified Q-ary Tree. This procedure will further reduce the accumulative bits from the reader's queries and improve the robustness of the overall identification process.
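The saving on Identical bits that motivates this design can be checked with a short count. The derivation below is a sketch under two assumptions that are not stated in this form in the text: every query costs as many bits as the prefix it carries, and all tags share the same n leading bits (n even), so that each level produces exactly one Collision cycle and the remaining nodes are Idle cycles. With n = 60 it reproduces the figures reported for Experiment Three: 120 cycles for either tree, and 3720 minus 3660 = 60 bits saved by the 2-ary tree.

```latex
% Assumption: cost of a query = length of its prefix; n identical leading bits.
\begin{align*}
\text{cycles}_{2} &= 2n, &
  \text{bits}_{2} &= \sum_{k=1}^{n} 2k = n(n+1),\\
\text{cycles}_{4} &= 4 \cdot \tfrac{n}{2} = 2n, &
  \text{bits}_{4} &= \sum_{j=1}^{n/2} 4 \cdot 2j = n(n+2),\\
n = 60:\quad \text{cycles} &= 120, &
  \text{bits}_{4} - \text{bits}_{2} &= 3720 - 3660 = 60.
\end{align*}
```

Under these assumptions the difference is always n bits, and the Idle to Collision ratio is 1:1 for the 2-ary tree and 3:1 for the 4-ary tree.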
The Joined approach combines Q-ary trees, specifically the 2-ary tree and the 4-ary tree, which were identified as the best Q-ary trees in our previous research (see the experiment on the Unified Q-ary Tree, Section 4.4.4). The Joined approach will be applied

to the EPC of each collided tag, which will be split using every 1 or 2-bits of tag ID for the first few queries; and then at one point, every 1 or 2-bits will be queried. In order to optimise the performance of the Joined Q-ary Tree, the right Separating Point (SP) between the two Q-ary trees needs to be configured. The objective of the Joined Q-ary Tree is to reduce the Bits Length queried by a reader so that identification time can be minimised. In this section, we will investigate and compare the Naive Q-ary Tree approach and our newly proposed Joined Q-ary Tree.

Figure 4.12: A sample of: a) a Naive 4-ary Tree, b) a Naive 2-ary Tree, and c) a Joined Q-ary Tree

Figure 4.12 shows the example of a) Naive 2-ary, b) Naive 4-ary, and c) Joined Q-ary Tree. The Joined Q-ary Tree bonds the 2-ary and 4-ary trees together and applies each to specific bits of the EPC, depending on how Identical or Unique they are.

EPC Bits Prediction and Classification

In a warehouse distribution environment following the Unique Item-Level and Unique Container-Level Scenarios, it is known that the first 36 bits of the EPC (Header and GMN) are definitely identical. However, the 24 bits of OC can be both Identical and Unique across all tags, depending on how many pallets exist within one interrogation zone. For example, if there are five pallets of 12 cases each in the interrogation zone, there will be five different OCs and sixty unique SNs for all sixty items (cases). Since the OC occupies 24 bits of the EPC (allowing 16,777,216 distinct values) but only five unique OCs are needed, we must calculate the number of Unique bits needed in order to apply the right Q-ary tree. This also applies to the SN, which contains a 36-bit string. Assuming that the EPC pattern is used, not all 36 bits of these strings will be Unique.

Table 4.10 shows a formal structure for bits classification of GID-96 bits EPC. It can be seen that the Identical bits of EPC always equal to 36-bits for the first 36-bits of EPC.
This includes the 8 bits of Header and the 28 bits of GMN, which are always the same for all tags. For Object Class, 24 bits are available, where the Unique bits within Object Class (UOC) can

be predicted using Equation 4.2. In addition, Unique bits within Serial Number (USN) with 36-bits can also be predicted using the same Equation.

Table 4.10: Formal structure of bits classification of EPC GID-96 bits. *UOC is number of Unique bits within Object Class and **USN is number of Unique bits within Serial Number

Field                    Length   Identical   Unique
Header                   8        8           0
General Manager Number   28       28          0
Object Class             24       24 - UOC    UOC*
Serial Number            36       36 - USN    USN**

Our method is executed based on the assumption that the approximate number of tags (pallets, cases) is known prior to the identification process. This information is needed for the Unique bits calculation: UOC and USN from Table 4.10. However, in most circumstances, the number of tags is unknown until the first query is issued by the reader. Therefore, the UOC and USN of the Joined Q-ary Tree can be initially set to zero and, after the first round of identification, these two parameters can be computed.

The Joined Q-ary Tree adaptively adjusts its tree branches at specific SPs. Each SP is configured according to the Identical bits and Unique bits within the EPC data. In order to calculate the estimated number of Unique bits within an EPC, we need the average number of tags within an interrogation zone, and then apply the equation below:

B = \log_2(N)    (4.2)

where N = Number of tags and B = Unique Bits of EPC.

Figure 4.13: Joined Q-ary Tree structure for GID-96 bits EPC

Figure 4.13 illustrates the tag splitting behaviour of a Massive tag movement within a warehouse. The first 36 bits of the EPC belong to Header and GMN; therefore the 2-ary tree is applied to these bits, since it has been identified as the most suitable tree for Identical bits of EPC. UOC and

USN, on the other hand, can be split using the 4-ary tree, since it has been shown to be the most suitable tree for Unique bits of EPC.

Unique Bits Computation

To demonstrate a calculation of Unique bits of EPC, we examine a Massive tag movement of 720 tags within 12 pallets (OC). By using Equation 4.2, UOC and USN can be calculated as follows:

UOC = \log_2(N) = \frac{\log_{10}(N)}{\log_{10}(2)} = \frac{\log_{10}(12)}{\log_{10}(2)} \approx 4

USN = \log_2(N) = \frac{\log_{10}(N)}{\log_{10}(2)} = \frac{\log_{10}(60)}{\log_{10}(2)} \approx 6

Therefore, the number of Unique bits required to cover all unique OC is approximately 4 bits, and approximately 6 bits for SN.

Table 4.11: Sample bits classification of EPC GID-96 bits, where Object Class = 12 and Serial Number = 60 (Total of 720 tags)

Field           Length   Identical   Unique
Header          8        8           0
GMN             28       28          0
Object Class    24       20          4
Serial Number   36       30          6

Figure 4.14: Joined Q-ary Tree structure for GID-96 bits EPC with 36 Identical bits Header and GMN, 20 Identical bits OC, 4 Unique bits OC, 30 Identical bits SN, and 6 Unique bits SN

Table 4.11 shows a sample structure for bits classification of 720 tags with the GID-96 bits EPC encoding scheme. Corresponding with Figure 4.14, we can see that the Identical bits of EPC always equal to 36-bits for the first 36-bits of EPC. The 2-ary tree is applied to these

102 CHAPTER 4. DETERMINISTIC ANTI-COLLISION APPROACHES bits. Identical bits of OC (20-bits) and SN (30-bits) are also spliced by 2-ary tree. UOC (4-bits) and USN (6-bits) on the other hand, can be split using 4-ary tree Tags Splitting We are now initiating a sample comparison between the performance of Naive 2-ary tree, Naive 4-ary tree, and Joined Q-ary Tree. Table 4.12 shows ten sample EPC of 36-bits tags with 24-bits Identical and 8-bits Unique. We assumed that all Identical bits of EPC belong to Header and GMN, while Unique bits of EPC belong to OC and SN. Table 4.12: Sample 36 bits tags with 24 Identical bits and 8 Unique bits Identical bits Unique bits Tag Identification After applying different trees on the sample tag sets, we can see that the Naive 2-ary tree has the best performance on Identical bits of EPC by using smallest number of queries (Bits Length) as shown in Table On the other hand, 4-ary tree performed better on Unique bits than 2-ary tree. Table 4.13: Identification process of 2-ary Tree and 4-ary Tree on Identical bits and Unique bits of EPC Performances on Identical bits Q-ary Idle Collision Successful Total Bits Tree Cycles Cycles Cycles Cycles Length 2-ary ary Performances on Unique bits Q-ary Idle Collision Successful Total Bits Tree Cycles Cycles Cycles Cycles Length 2-ary ary

Bits Length Calculation

In order to calculate the Total Bits Length required for the whole identification process for the 2-ary tree, the 4-ary tree, and the Joined Q-ary Tree, we need the Number of Child Nodes (NCN) for each level of the tree and the Number of Bits per Query (NBQ) for that specific level. The Number of Bits per Level (NBL) can be calculated using Equation 4.1:

NBL = NCN \times NBQ

After calculating the NBL for each level of the tree, the Total Bits Length (TNBL) required can be found by summing all NBL. The resulting TNBL values of 828 bits, 848 bits, and 728 bits, for the 2-ary tree, the 4-ary tree, and the Joined Q-ary Tree respectively, are shown in Table 4.14.

Table 4.14: Calculation of Total Bits Length required for two Naive Q-ary Trees and a Joined Q-ary Tree. TNBL shows the Bits Length required for the specific Naive/Joined Q-ary Tree. Columns: Level, Naive 2-ary, Naive 4-ary, Joined Q-ary. The final row gives the TNBL.

Table 4.15: Performance Analysis of Naive 2-ary Tree, Naive 4-ary Tree, and Joined Q-ary Tree on set of 10 sample tags. Columns: Q-ary Tree, Idle Cycles, Collision Cycles, Successful Cycles, Total Cycles, Identical Bits, Unique Bits, Total Bits. Rows: 2-ary, 4-ary, Joined.

Table 4.15 shows the overall calculation of Bits Length queried on Identical bits and Unique bits of EPC. We can see that the Joined Q-ary Tree required the least Total Bits Length compared with the two Naive Q-ary Trees. With 728 bits against 848 bits and 828 bits, the Joined Q-ary Tree reduced Bits Length by about 14 percent compared with the Naive 4-ary tree and by about 12 percent compared with the Naive 2-ary tree.

Experimental Evaluation

In order to show the significance of our proposed Joined Q-ary Tree methods, we conducted two experimental evaluations and compared our methods with existing techniques. There

are two major data sets in the experiments. We performed ten runs on each test case and present the average results.

Environment

To study the proposed Joined Q-ary Tree method, all experiments were performed according to a Crystal warehouse scenario and were assumed to be under the same environment as for the experimental evaluation of the Unified Q-ary Tree (Section 4.4.3).

Experiment One Data sets

In the first experiment, we used three different tag sets: 288 tags, 576 tags, and 864 tags. The impact of different numbers of tags in an interrogation zone on the performance of the Joined Q-ary Tree approach is evaluated. Data sets using the EPC pattern from Table 4.16 are applied:

Table 4.16: Chosen EPC Pattern for Experiment One

EPC Pattern
H
GMN   104,426,055
OC    [9,872,273-9,872,308]
SN    [26,292,755,245-26,292,755,268]

There are three test cases used in this experiment:

Test case A: 12 pallets, 24 cases each, total 288 tags
Test case B: 24 pallets, 24 cases each, total 576 tags
Test case C: 36 pallets, 24 cases each, total 864 tags

For the Joined Q-ary Tree approach, the SP of each test case must be calculated using Equation 4.2.

Theoretical Bits Prediction

Assuming that the number of tags is known before the identification process, by using Equation 4.2, UOC and USN of Test case A, B, and C can be calculated as follows:

Test case A: UOC = \log_2(12) \approx 4, USN = \log_2(24) \approx 5

Test case B: UOC = \log_2(24) \approx 5, USN = \log_2(24) \approx 5
Test case C: UOC = \log_2(36) \approx 6, USN = \log_2(24) \approx 5

Therefore, the numbers of Unique bits required to cover all unique OC and SN are as shown in Table 4.17.

Table 4.17: Identical and Unique Bits classification of EPC GID-96 bits for Experiment one - Test case A, B, and C. I = Identical bits, U = Unique bits. Columns: I and U for 288 tags, 576 tags, and 864 tags. Rows: H, GMN, OC, SN.

Actual Separating Point Configuration

After the theoretical bits estimation, we assumed that the actual encoding of Unique bits may be 1 to 2 bits longer than predicted. Therefore, we added 2 bits to the predicted Unique bits of each data set. If the resulting number of bits is odd, one more bit is attached. For example, the UOC of data set two (576 tags) is predicted to be 5 bits long. Adding 2 bits gives a total of 7 bits for UOC. Since this is an odd number, one more bit is attached (total 8 bits).

Table 4.18 shows the actual SP for each data set. At a specific SP, the Joined Q-ary Tree will adaptively change its branch to 2-ary or 4-ary. Since the 2-ary tree is applied from SP1 to SP3, the Joined Q-ary Tree only needs to adjust its branch at SP4, SP5, and SP6.

Table 4.18: Actual Separating Point for Experiment one - Test case A, B, and C. At a specific SP, Joined Q-ary Tree will adjust its branch to either 2-ary or 4-ary Tree. Columns: SP, 288 tags, 576 tags, 864 tags. Rows: H, GMN, OC(I), OC(U), SN(I), SN(U).
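The prediction and adjustment steps above can be written down in a few lines. This is a minimal sketch: the function names `predicted_unique_bits` and `configured_unique_bits` are hypothetical, and rounding Equation 4.2 up to a whole number of bits is an assumption inferred from the worked values (for example, log2(12) is taken as 4).

```python
from math import ceil, log2


def predicted_unique_bits(n):
    """Equation 4.2, rounded up to whole bits (assumed rounding rule)."""
    return ceil(log2(n))


def configured_unique_bits(n, margin=2):
    """Add the 2-bit margin, then attach one more bit if the total is odd."""
    bits = predicted_unique_bits(n) + margin
    return bits + (bits % 2)


# Pallet counts of Test cases A, B, and C (each pallet holds 24 cases).
for pallets in (12, 24, 36):
    print(
        pallets,
        predicted_unique_bits(pallets),   # predicted UOC
        configured_unique_bits(pallets),  # UOC after margin and even rounding
        configured_unique_bits(24),       # USN after margin and even rounding
    )
```

For 24 pallets this yields the 5, then 7, then 8 bits described in the example for data set two.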

106 CHAPTER 4. DETERMINISTIC ANTI-COLLISION APPROACHES Experiment Two Data sets We conducted a second experiment using three different tag sets: 288 tags, 576 tags, and 864 tags. The impact of different encoding schemes and performances of Joined Q-ary Tree approach is to be evaluated. Data sets using EPC pattern from Table 4.19 are applied: Table 4.19: Chosen EPC Pattern for Experiment Two GID-96 bits EPC Pattern H GMN 104,426,055 OC [9,872,273-9,872,308] SN [26,292,755,245-26,292,755,268] SGTIN-96 bits EPC Pattern H FV 011 PT 101 CP 9,352,006 IR [595, ,949] SN [121,705,236, ,705,236,389] GIAI-96 bits EPC Pattern H FV 011 PT 011 CP 581,162,659 IAR [815,223,149,060, ,223,149,061,627] There are three test cases used in this experiment: Test case A: 12 pallets, 24 cases each, total 288 tags Test case B: 24 pallets, 24 cases each, total 576 tags Test case C: 36 pallets, 24 cases each, total 864 tags Results This section presents the results and performance measurement of Joined Q-ary Tree. These results are displayed as follows: Experiment One Results Based on the experiment results shown in Figure 4.15, the Joined Q-ary Tree always performed the best out of the three approaches considered, while the 4-ary tree has the worst performance, regardless of number of tags within an interrogation zone. This corresponds with our methodology that if the Separating Point and the Q-ary trees are applied correctly to the EPC data, the optimal results can be achieved by the Joined Q-ary Tree. 80

107 4.5. JOINED Q-ARY TREE Figure 4.15: Performances comparison (GID-96) between Naive approaches and Joined Q-ary approach Table 4.20 demonstrates that 2-ary tree has the better performance than 4-ary tree by about 1 percent, while Joined Q-ary Tree s performance is approximately 12 percent better than the 2-ary tree. The 4-ary tree has the worst performance out of the three approaches considered. Table 4.20: Percentage improvement of the proposed Joined Q-ary Tree versus existing Naive 2-ary (N2) and Naive 4-ary (N4) approaches 2-ary Tree 4-ary Tree Joined Q-ary Improved Improved Improved Improved Improved Improved from N2 from N4 from N2 from N4 from N2 from N Figure 4.16 illustrates the improvement in percentage of Joined Q-ary Tree compared with the 2-ary tree and the 4-ary tree. The percentage of improvement increases more slowly once the number of tags within the interrogation zone gets higher. To further analyse and compare performances of Naive Q-ary approaches and Joined Q-ary approach, Table 4.21 shows Accumulative Bits Length of each approach until all tags were identified. We can see that the number of Bits Length accumulates faster when the identification reached SP4. This is when EPC data became more Unique and larger Bits Length issued by reader were required. All three approaches have the same pattern of increment, where number of tags in an interrogation zone has no impact on the 81


108 CHAPTER 4. DETERMINISTIC ANTI-COLLISION APPROACHES Figure 4.16: Percentage of improvement (GID-96) of Joined Q-ary Tree compared with Naive 2-ary Tree and Naive 4-ary Tree Table 4.21: Accumulative Bits Length of three approaches: Naive 2-ary Tree, Naive 4-ary Tree, and Joined Q-ary Tree 2-ary Tree 288 tags 576 tags 864 tags SP SP SP SP SP SP ary Tree 288 tags 576 tags 864 tags SP SP SP SP SP SP Joined Q-ary Tree 288 tags 576 tags 864 tags SP SP SP SP SP SP

109 4.5. JOINED Q-ARY TREE Figure 4.17: Accumulative Bits Length (GID-96) of three approaches: a) Naive 2-ary Tree, b) Naive 4-ary Tree, and c) Joined Q-ary Tree on different tag sets performances when EPC data were identical. This pattern can be seen in Figure 4.17, where Accumulative Bits Length of the Joined Q-ary Tree approach is displayed. There are no or little difference between SP1 to SP4 on all data sets, however, number of Bits Length started to increase rapidly once it reaches SP5 and SP6. Out of the three approaches, the Joined Q-ary Tree has the slowest incremental rate between SP4 and SP6 for all data sets. The incremental Bits Length also slows down once number of tags within the interrogation zone gets bigger. Therefore, the performance of Joined Q-ary Tree will increase for larger set of tags compared with the Naive approaches. 83

110 CHAPTER 4. DETERMINISTIC ANTI-COLLISION APPROACHES Since the Joined Q-ary Tree accumulated least Bits Length out of the three approaches, it can be concluded that the best performing approach, using GID-96 bits encoding scheme, is the Joined Q-ary Tree. Also, the number of tags within the interrogation zone has no impact on Identical bits of EPC data. Additionally, by applying the right Q-ary tree on specific bits of EPC, performance of Joined Q-ary Tree is improved Experiment Two Results We observed three types of EPC encoding scheme in this experiment. Table 4.22 and Figure 4.18 demonstrate results of the Joined Q-ary Tree using the three encoding schemes on different number of tags versus the two Naive Q-ary Trees. From both Figure and Table, it can be seen that the Joined Q-ary Tree always performed the best out of the three approaches considered, while the 4-ary tree has the worst performance for GID-96 bits (Figure 4.18a) and SGTIN-96 bits (Figure 4.18b) encoding schemes. For GIAI-96 bits encoding (Figure 4.18c), the Naive 2-ary tree has the worst performance compared with the Naive 4-ary tree and the Joined Q-ary Tree. Table 4.22: Number of bits length of three approaches using different Encoding Scheme: a) GID-96 bits, b) SGTIN-96 bits, and c) GIAI 96 bits GID-96 bits 288 tags 576 tags 864 tags 2-ary ary Joined SGTIN-96 bits 288 tags 576 tags 864 tags 2-ary ary Joined GIAI-96 bits 288 tags 576 tags 864 tags 2-ary ary Joined The differences in the results for GIAI-96 bits against the other two encoding schemes can be explained by the number of fields involved in the EPC pattern. From all 96 bits of the EPC data, the GIAI-96 bits scheme only has the IAR (Item-Level) field that involved both Identical and Unique bits of EPC. The other four fields engaged only Identical bits (See Table 4.19). 
This is different from the other two encoding schemes, where two partitions engage Unique bits of EPC data: 1) OC and 2) SN in GID-96 bits, and 1) IR and 2) SN in SGTIN-96 bits. Therefore, the Unique part of the EPC in the GIAI-96 bits encoding scheme is in the latter stage of the EPC data, which results in a higher number of queries being issued by both 2-ary and 4-ary trees. Since the 4-ary tree performs better on Unique bits, it needs fewer queries for the IAR field

compared with the 2-ary tree. The nature of the GIAI-96 bits encoding scheme also resulted in a minimal number of queries issued, compared with the other two encoding schemes, as it involves fewer partitions. Nevertheless, the Joined Q-ary Tree still outperformed both the Naive 2-ary and Naive 4-ary trees regardless of the encoding scheme used. This is because the Joined Q-ary Tree adapts the best tree to suit the circumstance of each part of the EPC data, which resulted in the lowest number of queries issued by the RFID readers.

Figure 4.18: Performances comparison between Naive approaches and Joined Q-ary approach using different Encoding Scheme: a) GID-96 bits, b) SGTIN-96 bits, and c) GIAI 96 bits
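The adaptive switching of tree branches can be sketched as a query tree whose branching width depends on the current prefix length. This is a minimal sketch under stated assumptions, not the implementation evaluated in this chapter: `segments`, `step_at`, `children`, and `joined_q_ary` are hypothetical names, each query is assumed to cost as many bits as its prefix, and the eight-bit sample tags (four Identical bits followed by four Unique bits) are invented for illustration.

```python
from collections import deque
from itertools import product


def step_at(length, segments):
    """Bits to append to a prefix of this length.

    segments is a list of (end_bit, bits_per_branch) pairs: 1 bit per branch
    for a 2-ary section, 2 bits per branch for a 4-ary section.
    """
    for end_bit, bits_per_branch in segments:
        if length < end_bit:
            return min(bits_per_branch, end_bit - length)
    raise ValueError("prefix already covers the whole ID (duplicate tags?)")


def children(prefix, segments):
    """Child prefixes of a collided node, using the tree of the current section."""
    step = step_at(len(prefix), segments)
    return [prefix + "".join(bits) for bits in product("01", repeat=step)]


def joined_q_ary(tags, segments):
    """Return (cycle counts, total queried bits) for unique bit-string tags."""
    cycles = {"idle": 0, "collision": 0, "success": 0}
    total_bits = 0
    queue = deque(children("", segments))
    while queue:
        prefix = queue.popleft()
        total_bits += len(prefix)  # assumed cost of one query
        matches = sum(t.startswith(prefix) for t in tags)
        if matches == 0:
            cycles["idle"] += 1
        elif matches == 1:
            cycles["success"] += 1
        else:
            cycles["collision"] += 1
            queue.extend(children(prefix, segments))
    return cycles, total_bits


sample = ["0000" + s for s in ("0001", "0110", "1011", "1100")]
print(joined_q_ary(sample, [(8, 1)]))          # naive 2-ary over all 8 bits
print(joined_q_ary(sample, [(8, 2)]))          # naive 4-ary over all 8 bits
print(joined_q_ary(sample, [(4, 1), (8, 2)]))  # 2-ary on bits 1-4, 4-ary on bits 5-8
```

On this toy set the joined configuration queries 44 bits, against 48 for the 4-ary tree and 54 for the 2-ary tree. A Unified Q-ary Tree corresponds to a `segments` list with exactly two entries.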

Table 4.23: Percentage of improvement of Joined Q-ary Tree versus two Naive approaches using different Encoding Scheme: a) GID-96 bits, b) SGTIN-96 bits, and c) GIAI 96 bits. Columns: 288 tags, 576 tags, 864 tags. Rows for each encoding scheme: Improved from 2-ary, Improved from 4-ary.

Table 4.23 demonstrates that the Joined Q-ary Tree performs better than the 2-ary tree by about 10 to 13 percent, and better than the 4-ary tree by around 8 to 13 percent, depending on the number of tags within the interrogation zone and the chosen encoding scheme. Figure 4.19 illustrates the improvement in percentage of the Joined Q-ary Tree compared with the 2-ary tree and the 4-ary tree. The percentage of improvement increases more slowly for the GID-96 bits and SGTIN-96 bits schemes once the number of tags within the interrogation zone gets higher. On the other hand, for GIAI-96 bits encoding, there is no decrease in system efficiency in the Joined Q-ary Tree, but the performance of the Naive trees increases as the number of tags increases.

Since the Joined Q-ary Tree accumulated the least Bits Length out of the three approaches, regardless of the encoding scheme and the number of tags within the interrogation zone, it can now be concluded that the best performing approach is the Joined Q-ary Tree.

4.6 Overall Analysis

A total of five experiments were conducted for the tree-based anti-collision approaches. The first three experimental evaluations were to verify the performance and capability of our proposed Unified Q-ary Tree, and the remaining two experiments were to prove the concept of the Joined Q-ary Tree.
From experiments one to three, we determined that, out of twelve combinations of Unified Q-ary Trees, the Unified 2-ary & 4-ary tree performed the best overall in terms of system robustness, conserving memory usage during the identification process. We then verified that by using a 2-ary tree for Identical bits of EPC and a 4-ary tree for Unique bits of EPC, the total number of bits for querying can be decreased. In addition to the first three experiments, the fourth and fifth demonstrate that the best performing approach is the Joined Q-ary Tree, using the GID-96 bits encoding scheme. The results acquired have shown that the Joined Q-ary Tree has far superior performance compared with our proposed Unified Q-ary Tree and the existing Naive Q-ary Trees. Moreover,

we also discovered that the Joined Q-ary Tree achieves the best performance, regardless of the type of encoding scheme. From the analysis of all experiments, we recognised certain properties of importance for tree-based anti-collision methods, which are: 1) similarity of EPC pattern, 2) number of tags within one group of the EPC pattern, and 3) overall number of tags within the interrogation zone.

Figure 4.19: Percentage of improvement of Joined Q-ary Tree compared with Naive 2-ary Tree and Naive 4-ary Tree, using different encoding schemes: a) GID-96 bits, b) SGTIN-96 bits, and c) GIAI-96 bits

4.7 Summary

In this chapter, we have investigated the problems of existing deterministic anti-collision schemes, and we proposed two new tree-based anti-collision methods in order to eliminate the shortcomings of existing techniques. The main contributions and findings of this chapter are as follows:

We have proposed a Unified Q-ary Tree (Pupunwiwat and Stantic, 2009a), (Pupunwiwat and Stantic, 2009b), which is a combination of two Q-ary trees, and we identified the best Q-ary tree for specific circumstances. We found that a 2-ary tree suits best when the EPC bits are identical, and that a 4-ary tree is best used when the EPC bits are unique. The best combination of 2-ary and 4-ary trees is then used for the construction of a Joined Q-ary Tree.

Based on findings from the Unified Q-ary Tree experiment, we then proposed a Joined Q-ary Tree (Pupunwiwat and Stantic, 2010c) to further minimise the total memory usage during the identification process. We discovered that the Joined Q-ary Tree performed the best compared with existing tree-based anti-collision techniques, regardless of the number of tags within the reader zone and the encoding scheme used.

We have confirmed that the similarity of EPC pattern, the number of tags within one group of the EPC pattern, and the overall number of tags within the interrogation zone all have an impact on the performance of any tree-based anti-collision scheme. Nevertheless, the best performing techniques, in terms of memory usage and robustness of the RFID system, are our proposed deterministic anti-collision techniques.

5 Probabilistic Anti-Collision Approaches

In this chapter, we tackle issues of existing probabilistic anti-collision schemes, such as the number of slots and frames produced during each identification process, and the performance efficiency. Firstly, we introduce a Precise Tag Estimation Scheme (PTES) (Pupunwiwat and Stantic, 2010a), (Pupunwiwat and Stantic, 2010b) to minimise the number of slots and frames queried by the RFID reader, and to maximise the system efficiency. Secondly, we introduce the Probabilistic Cluster-Based Technique (PCT) (Pupunwiwat and Stantic, 2010d) anti-collision method to improve the performance of the tag recognition process over existing methodologies. The remainder of this chapter comprises the mathematical fundamentals for probabilistic anti-collision schemes, the foundations of the proposed PTES and PCT methods, and the experimental evaluation.

5.1 Mathematic Fundamental for ALOHA-based Tag Estimation

In the Framed-Slotted ALOHA based probabilistic scheme, the Binomial distribution is a good fundamental method for estimating the number of present tags. For a given initial Q in a frame with F slots and n tags, the expected value of the number of slots with occupancy number x is as follows:

$$a_x = F \binom{n}{x} \left(\frac{1}{F}\right)^{x} \left(1 - \frac{1}{F}\right)^{n-x}$$

Therefore, the expected number of Empty slots e, Successful slots s, and Collision slots c is given by the following equations:

$$e = a_0 = F\left(1 - \frac{1}{F}\right)^{n}$$

$$s = a_1 = n\left(1 - \frac{1}{F}\right)^{n-1}$$

$$c = a_k = F - a_0 - a_1$$

Thus, the system efficiency (E) is defined as the ratio between the number of Successful slots and the frame-size, as per the following equation:

$$E = \frac{s}{F} = \frac{n\left(1 - \frac{1}{F}\right)^{n-1}}{F} = n \, \frac{1}{F}\left(1 - \frac{1}{F}\right)^{n-1}$$

It has been proven that the highest efficiency can be obtained if the frame-size F is equal to the number of tags n, provided that all slots have the same fixed length:

$$F(\text{optimal}) = n$$

Therefore, we make the assumption that by keeping the number of tags close to the available frame-size, the optimal performance efficiency can be obtained. According to the literature, it is possible to achieve the theoretically optimal efficiency of 36.8 percent in ALOHA-based systems.

5.2 Precise Tag Estimation Scheme

A frame-size prediction stage is one of the most crucial processes that determine the performance of the probabilistic anti-collision technique. In order to overcome shortcomings of existing methods for frame-size prediction, we propose a Precise Tag Estimation Scheme (PTES) that is compatible with any ALOHA-based anti-collision protocol. The aims of PTES are to obtain optimal tunable parameters that produce the minimum number of frames and slots, and to find the impact of collision slots and empty slots on Backlog estimation. After obtaining initial results for PTES, the optimal parameters found will be incorporated within our proposed PCT, in order to further improve the performance efficiency. This section describes the newly proposed PTES, the specific requirements for tag estimation, the initial Q value, the suggested frame-size, and sample tag estimation and allocation.
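As a numerical check on the Section 5.1 expressions for e, s, c and E, the following minimal Python sketch evaluates them for a given n and F and searches for the frame-size with the highest expected efficiency. The function names and the choice of n = 100 are illustrative assumptions, not part of the thesis.

```python
def expected_slots(n: int, frame_size: int) -> tuple[float, float, float]:
    """Expected (empty, successful, collision) slot counts for n tags
    choosing uniformly among frame_size slots (binomial model)."""
    stay_out = 1.0 - 1.0 / frame_size          # a tag avoids a given slot
    empty = frame_size * stay_out ** n         # e = a_0
    success = n * stay_out ** (n - 1)          # s = a_1
    collision = frame_size - empty - success   # c = F - a_0 - a_1
    return empty, success, collision


def efficiency(n: int, frame_size: int) -> float:
    """System efficiency E = s / F."""
    return expected_slots(n, frame_size)[1] / frame_size


if __name__ == "__main__":
    n = 100  # illustrative tag count (assumption)
    best_f = max(range(1, 4 * n + 1), key=lambda f: efficiency(n, f))
    print("best frame-size:", best_f)
    print("efficiency at best frame-size:", round(efficiency(n, best_f), 4))
```

For n = 100 the search settles on F = 100 with an efficiency just under 37 percent, in line with the 36.8 percent limit quoted above, which is approached as n grows.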

Slot Observation and Initial Q Value

For a general probabilistic anti-collision algorithm, the reader picks tags within an interrogation zone with the command Select, then issues Query, which contains a Q parameter to specify the frame-size, $[F = 2^Q - 1]$. For our PTES methodology, we have chosen the initial Q value to be any number between 1 and 15, which provides a sufficient maximum number of slots per frame. After the first round of identification, collision slots and empty slots are observed and used to estimate the remaining number of tags. After the number of tags has been estimated, the frame-size for the next identification round can be configured in accordance with the suggested frame-size threshold.

Suggested Threshold for Frame-Size

The suggested frame-size threshold used in our methodology is set according to the estimated number of tags. For example, if the estimated number of tags is 100, the suggested frame-size would have a Q value of 7. Since slot numbers run from 0 to $2^Q - 1$, a frame with Q = 7 allows at most 128 tags (boundary 0 to $2^7 - 1$) to be identified. Therefore, if the estimated number of tags is between 65 and 128, the suggested initial Q would be equal to 7.

Table 5.1 shows the Minimum and Maximum number of tags allowed per suggested frame-size, and the Minimum and Maximum boundary of random numbers generated per frame-size. The maximum number of tags allowed in each frame-size is calculated by $2^Q$ and the minimum number of tags allowed is calculated by $2^{Q-1} + 1$. The maximum frame-size boundary is calculated by $2^Q - 1$, while the minimum frame-size boundary is always 0. The table only demonstrates up to Q = 15.
Table 5.1: Suggested frame-size boundary (B) and minimum and maximum number of tags (NT) for specific estimated number of tags (columns: NT and B for each of Q = 1 to Q = 15; rows: Min, Max)
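The threshold rule behind Table 5.1 can be sketched in a few lines of Python. This is a minimal illustration built only from the formulas stated in the text (maximum tags $2^Q$, slot boundary 0 to $2^Q - 1$, and the example that 65 to 128 tags map to Q = 7); the function names are hypothetical, and mapping estimates below 2 to Q = 1 and clamping large estimates to Q = 15 are assumptions.

```python
def tag_range(q: int) -> tuple[int, int]:
    """Minimum and maximum number of tags suggested for a given Q,
    following the 65..128 -> Q = 7 example (min = 2**(Q-1) + 1)."""
    return 2 ** (q - 1) + 1, 2 ** q


def slot_boundary(q: int) -> tuple[int, int]:
    """Minimum and maximum slot number a tag may draw in a frame."""
    return 0, 2 ** q - 1


def suggested_q(estimated_tags: int, max_q: int = 15) -> int:
    """Smallest Q whose frame can hold the estimated number of tags.
    Estimates above 2**max_q are clamped to max_q (assumption)."""
    for q in range(1, max_q + 1):
        if estimated_tags <= 2 ** q:
            return q
    return max_q


if __name__ == "__main__":
    for estimate in (15, 16, 17, 100, 128, 129):
        print(estimate, "->", suggested_q(estimate))
```

A linear scan is used instead of a logarithm so that the boundary cases (for example exactly 128 tags) cannot be affected by floating-point rounding.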

PTES approaches

In this chapter, we propose three tag estimation methods for a Precise Tag Estimation Scheme (PTES). In method one, PTES[C], we use a single variable to predict collision slots for the next round of identification, and empty slots are not considered. In method two, PTES[CE], we use two variables to predict collision slots and empty slots for the new identification round. Finally, in method three, PTES[CCE], we use a fixed parameter to calculate collision slots for the first round of identification, then we use two variables to predict collision slots and empty slots from the second identification round onward. The aims of all PTES methods are to clarify that both collision slots and empty slots have an impact on Backlog prediction, and that more than one variable can be used to predict the frame-size for an upcoming round effectively, depending on the chosen initial Q. The three methods are explained within this sub-section.

PTES[C]

PTES[C] uses various parameters to predict collision slots for the new identification round. The PTES[C] method aims to obtain the optimal parameter in order to calculate and predict the closest number of remaining tags for the next round of identification. We assume that for the current identification round, each collision slot has at least two tags collided. However, it is impossible to determine the exact number of tags that actually caused the collision. There is exactly one tag per successful slot; therefore, we do not take successful slots into consideration. In addition, an empty slot does not engage any tag. Accordingly, we also do not take empty slots into consideration in this method. PTES[C] focuses on finding optimal parameters to calculate and predict the number of collision slots for the next identification round. The PTES[C] method uses different parameters between 2.0 and 3.0 to predict the number of collision slots.
Since a collision slot engages at least two tags, we assume that the parameter for collision slots calculation falls between 2.0 and 3.0 (at least two tags but possibly fewer than three tags). However, in reality, the number of tags per collision slot can be more than three. According to Schoute's method, which is a simple and accurate Backlog estimation technique, the parameter 2.39 is used for collision slots prediction. Therefore, we select our collision slots variable to be between 2.0 and 3.0. Equation 5.1 shows Backlog estimation using variable $2.0 \le V_1 \le 3.0$ for collision slots prediction. The variable $V_1$ is the tunable parameter used to predict the number of remaining tags, using information on collision slots from the previous frame.

$$\text{Backlog} = (V_1 \times c) \qquad (5.1)$$

where c is the number of Collision Slots, and $V_1$ is a variable between 2.0 and 3.0 with increments of 0.1.

$$\text{Backlog} = (2.0 \times c)$$
$$\text{Backlog} = (2.1 \times c)$$
$$\dots$$
$$\text{Backlog} = (3.0 \times c)$$

Therefore, there are eleven possible optimal $V_1$ values for the PTES[C] method. The estimated number of slots is rounded-up to the nearest integer.

Algorithm 1 demonstrates the PTES[C] algorithm applied for the Backlog estimation, which either keeps the current frame-size or re-adjusts the frame-size for the next identification cycle.

Input: c = Collision Slots, Backlog
Output: Q = Frame-Size Adjustment
for (Frame-Size prediction procedure) do
    Backlog = ($V_1$ * c);
    while Looking up Suggested Threshold for Frame-Size Table do
        if Found Matched Q for specific Backlog then
            Re-adjust Q Value;
        end
    end
    Output Q Adjust Value;
end
Algorithm 1: PTES[C] Algorithm

PTES[CE]

Similar to the PTES[C] method, the PTES[CE] method aims to obtain the optimal parameters in order to calculate and predict the closest Backlog for the next identification round. We assume that for the current identification round, each collision slot has at least two tags collided. However, we cannot know for sure how many tags actually caused the collision. There is exactly one tag per successful slot; thus, we do not take successful slots into consideration. On the other hand, we assume that empty slots will continuously occur during the next rounds of identification regardless of the frame-size. Thus, the PTES[CE] method is created to find the optimal parameters and to predict the number of remaining tags for the upcoming round, using information from both collision slots and empty slots of the current frame. The PTES[CE] method uses any variable between 2.0 and 3.0 to predict the number of collision slots. Variables between 0.1 and 0.9 are also used to predict the number of

empty slots for the upcoming round. Since an empty slot does not engage any tag, we assume that the parameter for empty slots calculation will fall between 0.1 and 0.9 (no more than one tag). Equation 5.2 shows Backlog estimation using variable $V_1$ for collision slots prediction and variable $V_2$ for empty slots prediction. Both variables $V_1$ and $V_2$ are tunable parameters used to predict the number of remaining tags, using information on collision slots and empty slots from the previous frame.

Figure 5.1: Variable $V_1$ and $V_2$ for Collision slot and Empty slot calculation for PTES[CE] method. There are ninety-nine possible combinations of $V_1$ and $V_2$, in order to find optimal parameters for c and e prediction (grid of all pairs $V_1 c$, $V_2 e$ with $V_1$ = 2.0, 2.1, ..., 3.0 and $V_2$ = 0.1, 0.2, ..., 0.9)
$$\text{Backlog} = (V_1 \times c + V_2 \times e) \qquad (5.2)$$

where c is the number of Collision Slots; e is the number of Empty slots; $V_1$ is a variable between 2.0 and 3.0; and $V_2$ is a variable between 0.1 and 0.9, both with increments of 0.1.

$$\text{Backlog} = (2.0 \times c + 0.1 \times e)$$
$$\dots$$
$$\text{Backlog} = (2.0 \times c + 0.9 \times e)$$
$$\text{Backlog} = (2.1 \times c + 0.1 \times e)$$
$$\dots$$
$$\text{Backlog} = (3.0 \times c + 0.9 \times e)$$

Therefore, there are ninety-nine possible optimal $V_1$ and $V_2$ variables for this method. The estimated number of slots is rounded-up to the nearest integer. Figure 5.1 shows the ninety-nine possible optimal $V_1$ and $V_2$ variables for c and e.

Algorithm 2 demonstrates the PTES[CE] algorithm applied for the Backlog estimation, which either keeps the current frame-size or re-adjusts the frame-size for the next identification cycle.

Input: c = Collision Slots, e = Empty Slots, Backlog
Output: Q = Frame-Size Adjustment
for (Frame-Size prediction procedure) do
    Backlog = ($V_1$ * c + $V_2$ * e);
    while Looking up Suggested Threshold for Frame-Size Table do
        if Found Matched Q for specific Backlog then
            Re-adjust Q Value;
        end
    end
    Output Q Adjust Value;
end
Algorithm 2: PTES[CE] Algorithm

PTES[CCE]

The PTES[CCE] method uses parameter 2.0 to predict the number of collision slots after the first round of identification. Parameter 2.0 is chosen according to the assumption that at least two tags collided per collision slot. Since the number of tags is supposedly unknown at the beginning, a simple frame-size prediction using variable 2.0 is chosen for the next Q adjust value. Equation 5.1 with variable $V_1 = 2.0$ is applied for collision slots prediction.

$$\text{Backlog} = (2.0 \times c)$$

where c is the number of Collision Slots. PTES[CCE] uses Equation 5.2 to predict the number of collision slots and empty slots from the second round onward. Equation 5.2 shows Backlog estimation using variable $V_1$ for collision slots prediction and variable $V_2$ for empty slots prediction.

$$\text{Backlog} = (V_1 \times c + V_2 \times e)$$

where c is the number of Collision Slots; e is the number of Empty slots; $V_1$ is a variable between 2.0 and 3.0; and $V_2$ is a variable between 0.1 and 0.9, both with increments of 0.1.

Algorithm 3 demonstrates the PTES[CCE] algorithm applied for the Backlog estimation, which either keeps the current frame-size or re-adjusts the frame-size for the next identification cycle.

Input: c = Collision Slots, e = Empty Slots, Backlog, Roundcount
Output: Q = Frame-Size Adjustment
for (Frame-Size prediction procedure) do
    if Roundcount == 0 then
        Backlog = (2.0 * c);
    end
    else
        Backlog = ($V_1$ * c + $V_2$ * e);
    end
    Roundcount = Roundcount + 1;
    while Looking up Suggested Threshold for Frame-Size Table do
        if Found Matched Q for specific Backlog then
            Re-adjust Q Value;
        end
    end
    Output Q Adjust Value;
end
Algorithm 3: PTES[CCE] Algorithm

Table 5.2 displays the comparison between our three proposed PTES methods. From the table, PTES[C] only uses variable $V_1$ for tag estimation, while the other two PTES methods use both $V_1$ and $V_2$. The difference between PTES[CE] and PTES[CCE] is in the first round of tag prediction, where PTES[CCE] introduces discrete estimation, using only variable $V_1$ for the first round of identification; from the second round onward, all procedures are the same as in PTES[CE].

Table 5.2: PTES methods comparison

Method       Round            Variable $V_1$   Variable $V_2$
PTES[C]      All              2.0 to 3.0       N/A
PTES[CE]     All              2.0 to 3.0       0.1 to 0.9
PTES[CCE]    First            2.0              N/A
PTES[CCE]    Second onward    2.0 to 3.0       0.1 to 0.9

Sample Tag Estimation and Allocation

This section describes a sample tag allocation and estimation for all three PTES methods. Suppose there are twenty tags to be identified. While performing the probabilistic anti-collision algorithm, however, the number of tags is supposedly unknown. The initial Q value for this example is set to 4; thus, the number of available slots for the first round of identification is equal to 16 (slots 0 to $2^4 - 1$). Figure 5.2 shows a sample of first round tag allocation, where seven collision slots, four empty slots, and five successful slots occurred. For each collision slot, two or more tags collided, while an empty slot engaged no tag. Each successful slot holds exactly one tag. After the first round of tag allocation, the PTES equations are applied in order to find an estimated frame-size for the next round.
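The three Backlog rules of Algorithms 1 to 3 can be put side by side in a short Python sketch. It is a minimal illustration, not the thesis implementation: the function names are hypothetical, the table look-up is replaced by the threshold rule stated in the text (smallest Q with estimate at most $2^Q$), and rounding to the nearest integer with halves rounded up is an assumption inferred from the worked samples (15.4 becomes 15, 6.5 becomes 7).

```python
import math


def round_half_up(value: float) -> int:
    """Round to the nearest integer, halves upward (assumed rounding rule)."""
    return math.floor(value + 0.5)


def suggested_q(estimated_tags: int, max_q: int = 15) -> int:
    """Smallest Q whose frame can hold the estimate (threshold look-up)."""
    for q in range(1, max_q + 1):
        if estimated_tags <= 2 ** q:
            return q
    return max_q


def ptes_backlog(method: str, collisions: int, empties: int = 0,
                 v1: float = 2.0, v2: float = 0.0, round_count: int = 0) -> int:
    """Backlog estimate for PTES[C], PTES[CE] or PTES[CCE]."""
    if method == "C":                       # Equation 5.1
        estimate = v1 * collisions
    elif method == "CE":                    # Equation 5.2
        estimate = v1 * collisions + v2 * empties
    elif method == "CCE":                   # fixed 2.0 in the first round only
        if round_count == 0:
            estimate = 2.0 * collisions
        else:
            estimate = v1 * collisions + v2 * empties
    else:
        raise ValueError(f"unknown PTES method: {method}")
    return round_half_up(estimate)


if __name__ == "__main__":
    # First round of the sample: c = 7 collision slots, e = 4 empty slots.
    for method in ("C", "CE", "CCE"):
        backlog = ptes_backlog(method, 7, 4, v1=2.0, v2=0.5)
        print(method, backlog, suggested_q(backlog))
```

Applied to the sample first round (c = 7, e = 4) with $V_1 = 2.0$ and $V_2 = 0.5$, PTES[C] and PTES[CCE] estimate 14 tags and PTES[CE] estimates 16, and all three map to Q = 4.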

Figure 5.2: A sample first round of tag allocation with Initial Q of 4. Collision slot c = 7, Empty slot e = 4, and Successful slot s = 5

Sample Tag Estimation - PTES[C]

After the first round of identification shown in Figure 5.2, we applied the PTES[C] method with variable $V_1$ between 2.0 and 3.0 to estimate collision slots for the upcoming round. The actual number of remaining tags from this round is fifteen. For instance, after applying Equation 5.1 using $V_1 = 2.2$, the number of estimated tags for the next round can be calculated as follows:

$$\text{Backlog} = (2.2 \times 7) = 15.4 \approx 15$$

Therefore, the estimated number of tags for the next round is equal to fifteen. Hence, the new Q adjust is equal to 4 (see Table 5.1 for the suggested frame-size). Nevertheless, if a different variable $V_1$ is used, the number of estimated tags and the new Q adjust would be different, as shown in Table 5.3.

Table 5.3: Sample tag estimation and frame-size (Q) adjustment after the first round of identification, using PTES[C] method (round one, c = 7; columns: Variable ($V_1$), Tag Estimation, Q Adjust)

Subsequent to the first round of identification, according to Table 5.3, the adjusted Q value is equal to either 4 or 5, depending on the parameter $V_1$. In order to identify all tags within the interrogation zone, PTES[C] with variable $V_1$ is applied until no more collisions occur. Corresponding to Figure 5.3, we can

see that the second round of identification splits into two, using Q adjust of 4 and 5.

Figure 5.3: A sample of Q-adjust in each round of identification until all tags are identified

Table 5.4: Sample tag estimation and frame-size (Q) adjustment after the second round of identification, using PTES[C] method (two sub-tables: round two with c = 4 and round two with c = 3; columns: Variable ($V_1$), Tag Estimation, Q Adjust)

Table 5.4 shows the second round of identification. After the second round, the Q adjust for variable $V_1$ = 2.0, 2.1, 2.4, 2.5, 2.6, 2.7, and 2.8 is equal to 3, while the Q adjust for variable $V_1$ = 2.2, 2.3, 2.9, and 3.0 is equal to 4. Corresponding to Figure 5.3, we can see that the third round of identification splits further into four Q adjust.

Similar to the first two rounds of identification, Table 5.5 shows the third round of identification. In this round, tag identification is completed when using variable $V_1$ = 2.9 and 3.0. However, identification using variable $V_1$ between 2.0 and 2.8 required further recognition. The Q adjust for all variables except $V_1$ = 2.2 is equal to 3. Figure 5.3 shows that the fourth round of identification splits further into four Q adjust. More identification is completed within the fourth round, as shown in Table 5.6. Only variable $V_1$ between 2.0 and 2.2 required further identification. Figure 5.3 shows that the fifth round only splits into two Q adjust. In the last round of identification, all tags can be recognised using variable $V_1$ = 2.0, 2.1, and 2.2, as shown in Table 5.7.

Table 5.5: Sample tag estimation and frame-size (Q) adjustment after the third round of identification, using PTES[C] method (three sub-tables: round three with c = 3, c = 2, and c = 0; columns: Variable ($V_1$), Tag Estimation, Q Adjust; the c = 0 entries are marked Completed)

Table 5.6: Sample tag estimation and frame-size (Q) adjustment after the fourth round of identification, using PTES[C] method (two sub-tables: round four with c = 1 and c = 0; columns: Variable ($V_1$), Tag Estimation, Q Adjust; the c = 0 entries are marked Completed)

Table 5.7: Sample tag estimation and frame-size (Q) adjustment after the fifth round of identification, using PTES[C] method (round five, c = 0; columns: Variable ($V_1$), Tag Estimation, Q Adjust; all entries are marked Completed)

Since the sample tag identification using PTES[C] mostly explains how our proposed methods estimate the number of tags in each round, we will show the actual tag allocation using PTES[CE] in the next sub-section.
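The first two rounds of the PTES[C] sample can be retraced with a short Python sketch. It is a hedged reconstruction from the prose only: $V_1$ is held in integer tenths to avoid floating-point edge cases, rounding to the nearest integer (halves up) is assumed because it matches the worked value 2.2 x 7 giving 15, and the pairing of c = 4 with the Q = 4 branch and c = 3 with the Q = 5 branch in round two is an assumption chosen because it reproduces the split reported in the text. All names are hypothetical.

```python
def estimate_backlog(v1_tenths: int, collisions: int) -> int:
    """V1 * c with V1 given in tenths, rounded to nearest (halves up)."""
    return (v1_tenths * collisions + 5) // 10


def suggested_q(estimated_tags: int, max_q: int = 15) -> int:
    """Smallest Q whose frame can hold the estimate (threshold look-up)."""
    for q in range(1, max_q + 1):
        if estimated_tags <= 2 ** q:
            return q
    return max_q


V1_TENTHS = range(20, 31)  # V1 = 2.0, 2.1, ..., 3.0

# Round one: seven collision slots were observed for every V1.
round_one_q = {k: suggested_q(estimate_backlog(k, 7)) for k in V1_TENTHS}

# Round two: collisions observed per branch (assumed pairing, see lead-in).
ROUND_TWO_COLLISIONS = {4: 4, 5: 3}  # Q used in round two -> c observed
round_two_q = {
    k: suggested_q(estimate_backlog(k, ROUND_TWO_COLLISIONS[round_one_q[k]]))
    for k in V1_TENTHS
}

if __name__ == "__main__":
    for k in V1_TENTHS:
        print(f"V1={k / 10:.1f}  Q after round one={round_one_q[k]}  "
              f"Q after round two={round_two_q[k]}")
```

Under these assumptions, round two yields Q = 4 for $V_1$ = 2.2, 2.3, 2.9 and 3.0 and Q = 3 for the remaining values, which agrees with the description of Table 5.4.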

Sample Tag Allocation - PTES[CE]

After the first round of identification shown in Figure 5.2, the PTES[CE] method uses variable $V_1$ between 2.0 and 3.0 to estimate collision slots, and variable $V_2$ between 0.1 and 0.9 to estimate empty slots for the next round. The actual number of remaining tags from this round is fifteen. For example, after applying Equation 5.2 using $V_1 = 2.0$ and $V_2 = 0.5$, the number of estimated tags for the next round can be calculated as follows:

$$\text{Backlog} = (2.0 \times 7 + 0.5 \times 4) = 16$$

Therefore, the estimated number of tags for the next round is equal to sixteen. Hence, the new Q adjust is equal to 4 (see Table 5.1 for the suggested frame-size). Following the first round, the adjusted Q value for an estimated number of tags between 12 and 16 is equal to 4, while the Q value for an estimated number of tags between 17 and 25 is equal to 5. In order to identify all tags within the interrogation zone, PTES[CE] with variables $V_1$ and $V_2$ is applied for each identification round until no more collisions occur and all tags are identified.

Figure 5.4: A sample second round of tag allocation with Initial Q of 4, $V_1$ = 2.0, and $V_2$ = 0.5. Collision slot c = 2, Empty slot e = 5, and Successful slot s = 9

Figures 5.4, 5.5, and 5.6 show examples of the further identification process using variables $V_1 = 2.0$ and $V_2 = 0.5$. Figure 5.4 shows a sample of second round tag allocation, where two collision slots, five empty slots, and nine successful slots occurred. The actual number of remaining tags from this round is six. The number of estimated tags for round three can be calculated as follows:

$$\text{Backlog} = (2.0 \times 2 + 0.5 \times 5) = 6.5 \approx 7$$

Therefore, the estimated number of tags for the next round is equal to seven. The new Q adjust is equal to 3. Figure 5.5 shows a sample of third round tag allocation, where one collision slot, three empty slots, and four successful slots occurred. The actual number of remaining tags from this round is two.
The number of estimated tags for round four can be calculated as follows:

$$\text{Backlog} = (2.0 \times 1 + 0.5 \times 3) = 3.5 \approx 4$$

The estimated number of tags for the next round is equal to four. Therefore, the new Q adjust is equal to 2.

Figure 5.5: A sample third round of tag allocation with Initial Q of 3, $V_1$ = 2.0, and $V_2$ = 0.5. Collision slot c = 1, Empty slot e = 3, and Successful slot s = 4

Figure 5.6: A sample fourth (final) round of tag allocation with Initial Q of 2, $V_1$ = 2.0, and $V_2$ = 0.5. Collision slot c = 0, Empty slot e = 2, and Successful slot s = 2

Figure 5.6 shows a sample of final round tag allocation, where no collision slots, two empty slots, and two successful slots occurred. There are no more tags remaining since no collision occurred; thus, the identification process using the probabilistic anti-collision algorithm terminates after this round.

Experimental Evaluation

In order to show the significance of our proposed PTES methods, we conducted two experimental evaluations and compared our methods with existing techniques.

Preliminary

To study the Precise Tag Estimation Scheme, all experiments are assumed to be set up in a well-controlled environment where there is no metal or water nearby. We randomly generated all data sets with the assumptions that a UHF RFID reader is used and passive RFID tags are attached to each item. At this stage, we assume that all items are static and that no type of interference besides collision itself is present. It is also assumed that other types of data stream errors, such as data duplication, have been filtered at the

earlier stage. Different tag sets are simulated for each experiment. While performing each anti-collision algorithm, the number of tags is supposedly unknown. We performed ten runs on each test case and present the average results.

Experiment One Data Set

The aim of the first experiment is to find the impact on system efficiency of the different initial Q used by the PTES methods versus existing methods. Tag sets of 200 and 300 tags are utilised in the experiment. Different initial Q values of 6, 7, 8 and 9 are applied to each tag set. The data sets and initial Q parameters are selected based on the supposed capability of the UHF RFID reader and the read rates of passive tags. Table 5.8 demonstrates the type of Backlog prediction applied to different numbers of tags, different initial Q, and tunable parameters. All methods are applied separately to different randomly generated data sets, giving a total of 1688 test cases within this experiment.

Table 5.8: Chosen Parameters for Experiment One (columns: Backlog Prediction, No. Tags, $V_1$, $V_2$, Initial Q, Total Test Case; rows: PTES[C], PTES[CE], PTES[CCE], Schoute, Lowerbound)

Experiment Two Data Set

The second experiment is to find the optimal parameters of PTES that produce the minimal number of slots and frames, and generate the highest system efficiency, compared with the existing methods. There are five tag sets comprising 100, 200, 300, 400, and 500 tags. The initial Q of this experiment is fixed to 8. Table 5.9 displays the type of Backlog prediction applied to different numbers of tags, the fixed initial Q, and tunable parameters. All methods are applied separately to different randomly generated data sets, giving a total of 1055 test cases within this experiment.

Table 5.9: Chosen Parameters for Experiment Two (columns: Backlog Prediction, Initial Q, $V_1$, $V_2$, No. Tags, Total Test Case; rows: PTES[C], PTES[CCE], PTES[CE], Schoute, Lowerbound)
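The stated totals of 1688 and 1055 test cases can be reproduced with simple arithmetic. The Python sketch below shows one breakdown that matches both totals; the per-method counts (eleven $V_1$ values for PTES[C], ninety-nine $V_1$ and $V_2$ pairs each for PTES[CE] and PTES[CCE], and a single configuration each for Schoute and Lowerbound) are assumptions drawn from the parameter ranges in the text, and the variable names are illustrative.

```python
# Parameter settings per Backlog prediction method (assumed breakdown).
V1_VALUES = 11           # 2.0, 2.1, ..., 3.0
V2_VALUES = 9            # 0.1, 0.2, ..., 0.9

configs_per_method = {
    "PTES[C]": V1_VALUES,
    "PTES[CE]": V1_VALUES * V2_VALUES,
    "PTES[CCE]": V1_VALUES * V2_VALUES,
    "Schoute": 1,
    "Lowerbound": 1,
}
total_configs = sum(configs_per_method.values())

# Experiment one: 2 tag sets (200, 300) x 4 initial Q values (6, 7, 8, 9).
experiment_one_cases = total_configs * 2 * 4

# Experiment two: 5 tag sets (100 to 500) x 1 initial Q value (8).
experiment_two_cases = total_configs * 5 * 1

if __name__ == "__main__":
    print(total_configs, experiment_one_cases, experiment_two_cases)
```

With 211 parameter settings in total, the two products come to 1688 and 1055, matching the counts given for the two experiments.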

Results

This section presents the evaluation of the impact of different initial Q on the chosen frame-size prediction methods, and demonstrates the impact of different numbers of tags within an interrogation zone on the anti-collision approaches. These results are displayed as follows:

Results - Impacts of Different Initial Q

In the first experiment, we compared our PTES algorithm with the Schoute (Sch) and Lowerbound (LB) methods. According to our surveys, the two methods are simple and give accurate Backlog prediction. We divide our results into two parts. In the first part, we present results on the PTES[C] method; in the second part, we demonstrate results on the PTES[CE] and PTES[CCE] methods. We present results separately for PTES[C] because it is the only method that involves a single parameter ($V_1$), while the other two PTES methods use both parameters $V_1$ and $V_2$.

Figure 5.7: Performance efficiency of PTES[C], Sch, and LB methods, using different Initial Q: a) PTES[C] 200 tags and b) PTES[C] 300 tags

Part I

Experiment results, as shown in Figure 5.7, illustrate that different initial Q values have an individual impact on system efficiency for the PTES[C] method. All of the PTES[C] parameters ($V_1$ = 2.0 to 3.0) are displayed in the figure. Considering different numbers of tags (see Figure 5.7a and 5.7b), the results show that an initial Q of 8 is the optimal Q for all approaches. According to both figures, the maximum performance efficiency using different methods, including the Sch, LB, and PTES[C] methods, can be achieved when the initial Q is set to 8.

Theoretically, the optimal performance efficiency can be achieved when the number of tags is equal to the frame length. This is supported by Figure 5.7a), where the optimal performance was reached with an initial Q of 8 regardless of the method applied. When the number of tags is equal to 200, Q = 8 is the best candidate since it gives the nearest frame length available. Similarly, when the number of tags is increased to 300, an initial Q of 8 is still the best option as it has the closest frame length.

Figure 5.8: Performance efficiency of PTES[CE], PTES[CCE], Sch, and LB methods, using different Initial Q: a) PTES[CE] 200 tags, b) PTES[CE] 300 tags, c) PTES[CCE] 200 tags and d) PTES[CCE] 300 tags
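The claim that Q = 8 gives the closest frame length for both 200 and 300 tags can be illustrated with the single-frame efficiency formula from Section 5.1. The Python sketch below is only a rough check under stated assumptions: it takes a frame of $2^Q$ slots, looks at the expected efficiency of the first frame alone, and does not simulate the multi-round identification measured in the experiments. The function names are hypothetical.

```python
def first_frame_efficiency(n: int, q: int) -> float:
    """Expected efficiency E = (n / F) * (1 - 1/F)**(n - 1) for F = 2**q slots."""
    frame_size = 2 ** q
    return (n / frame_size) * (1.0 - 1.0 / frame_size) ** (n - 1)


def best_initial_q(n: int, candidates=(6, 7, 8, 9)) -> int:
    """Candidate Q with the highest expected first-frame efficiency."""
    return max(candidates, key=lambda q: first_frame_efficiency(n, q))


if __name__ == "__main__":
    for n in (200, 300):
        row = {q: round(first_frame_efficiency(n, q), 3) for q in (6, 7, 8, 9)}
        print(n, "tags:", row, "best Q =", best_initial_q(n))
```

Among the tested values 6 to 9, Q = 8 (256 slots) has the highest expected first-frame efficiency for both tag sets, which is consistent with the experimental observation.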

Part II

From Figure 5.8, it can be seen that different initial Q values also have an individual impact on system efficiency for both the PTES[CE] and PTES[CCE] methods. The best parameters, $V_1$ between 2.0 and 2.5 and $V_2$ between 0.1 and 0.2, are displayed in the figure. Similar to PTES[C], Figure 5.8 illustrates that the maximum performance efficiency using different methods (Sch, LB, PTES[CE], and PTES[CCE]) can be achieved when the initial Q is 8. In addition, it is noticeable that most variables of PTES[CCE] perform more stably compared with the PTES[CE] approach. For instance, Figure 5.8c) shows more overlapping lines than Figure 5.8a), which means that the parameters of PTES[CCE] generate more consistent results than those of PTES[CE].

The challenge of choosing the initial Q is to take into consideration the actual number of tags that will be presented to the reader at the beginning of the read cycle. From the results of both parts of the first experiment, we verified that an initial Q of 8 is the most suitable Q on average for our proposed PTES methods. Nevertheless, the selected initial Q is mainly appropriate for a PTES method that is incorporated with a probabilistic anti-collision approach without a grouping strategy. For other probabilistic anti-collision methods that involve grouping rules, such as EDFSA and PCT, a different initial Q may be more suitable for specific tag groups.

Results - Impacts of Different Number of Tags

The second experiment verifies that different numbers of tags also have an impact on the performance of each anti-collision technique. An initial Q of 8 is used in this experiment since it gives the maximum performance efficiency according to the first experiment. We compared our PTES algorithm with the Sch method since it is simple and gives accurate Backlog prediction. The LB method is not considered in this experiment due to the excessive number of frames it required in the initial test cases compared with the Schoute method.
In this experiment, we also divide our results into two parts. In the first part, we present results for the PTES[C] method; in the second part, we present results for the PTES[CE] and PTES[CCE] methods.

Part I

To measure the performance efficiency of the PTES[C] approach, we performed testing on data sets of 100 to 500 tags and compared the results against the Sch method. Table 5.10 shows that every anti-collision method achieved its highest system efficiency when the number of tags is 200. As in the first experiment, this result validates the Binomial distribution fundamental that the optimal efficiency is obtained when the frame-size is equal to the number of tags. Nevertheless, for 200 to 500 tags, all methods maintain their system efficiency above 30 percent. The table also demonstrates that the optimal parameter $V_1$ for PTES[C] is 2.5, which gives the highest system efficiency compared with the other values. In addition, parameters $V_1$ of 2.3 and 2.4 give efficiency equivalent to the Sch method for all tag sets. Since the Sch method uses 2.39 for variable $V_1$, it is reasonable for parameters 2.3 and 2.4 to perform similarly to the Sch method. However, our results show that 2.5 is the optimal $V_1$ parameter, which contradicts the Sch method's claim that 2.39 is the optimal value. We attribute this result to the initial Q chosen in our experiment. It can be assumed that the Sch method considers its performance across all Qs, while we only consider the optimal Q in our case. Since the initial Q is tunable by the user, we decided that it is only necessary to find the optimal $V_1$ for a specific initial Q.

Table 5.10: Performance efficiency of PTES[C], Sch, and LB methods, using Initial Q of 8 on different sets of tags. Rows: PTES[C] parameter settings and Schoute. Columns: efficiency at 100, 200, 300, 400, and 500 tags.

Figure 5.9: Performance efficiency (a) and Number of frames (b) of PTES[C] ($V_1$ = 2.3 to 2.5) versus Sch methods, using Initial Q of 8 on different tag sets
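The role of $V_1$ can be sketched as follows. This is only an illustration under stated assumptions: it assumes the Backlog estimate is the number of collision slots multiplied by $V_1$ (the form used by the Sch method with its fixed 2.39) and that the next frame uses the Q whose $2^Q$ is nearest to that estimate. The names `estimate_backlog` and `next_q` are hypothetical, and the exact PTES[C] formula is not restated here.

```python
SCHOUTE_FACTOR = 2.39  # the V1 value the text attributes to the Sch method


def estimate_backlog(collision_slots, v1=SCHOUTE_FACTOR):
    """Estimate unread tags from the collision slots of the last frame.

    Assumption: backlog = v1 * collision_slots (collision slots only).
    """
    return v1 * collision_slots


def next_q(backlog, q_min=0, q_max=15):
    """Pick the Q whose frame length 2**Q is closest to the estimate.

    Assumption: this nearest-power-of-two rule is illustrative only.
    """
    return min(range(q_min, q_max + 1), key=lambda q: abs(2 ** q - backlog))


collisions = 100
for v1 in (2.3, SCHOUTE_FACTOR, 2.4, 2.5):
    backlog = estimate_backlog(collisions, v1)
    print(v1, round(backlog, 1), next_q(backlog))
```

With 100 collision slots, every $V_1$ between 2.3 and 2.5 leads to the same Q in this sketch, which is one way to see why nearby $V_1$ values can produce similar efficiency.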

After considering performance efficiency, we also look into the total number of frames produced by each method. The number of frames determines the initiation time in each identification cycle: the more frames are issued, the longer the identification delay. Figure 5.9b) shows that PTES[C] with the optimal parameter $V_1$ of 2.5 also used the lowest number of frames for every number of tags. Judging from both the performance efficiency (Figure 5.9a) and the number of frames queried (Figure 5.9b), we conclude that PTES[C] with the optimal parameter $V_1$ = 2.5 and the optimal initial Q of 8 achieves better system efficiency than the Sch method.

Table 5.11: Performance efficiency of PTES[CE], PTES[CCE], Sch, and LB methods, using Initial Q of 8 on different sets of tags. Rows: PTES[CE] and PTES[CCE] parameter settings ($V_1$ from 2.0 to 2.5, each with two $V_2$ settings) and Schoute. Columns: efficiency at 100, 200, 300, 400, and 500 tags.

Part II

Table 5.11 shows that for 200 to 500 tags, all methods, including Sch, PTES[CE], and PTES[CCE], maintain their performance efficiency above 30 percent. The table also demonstrates that both PTES[CE] and PTES[CCE] give the highest system efficiency with parameters $V_1$ and $V_2$ equal to 2.3 and 0.1 respectively. The two PTES methods perform equivalently for all data sets except the 400-tag set.

Specifically, Figure 5.10a) shows that the PTES[CE] method obtained lower efficiency than the Sch method for 400 tags, despite the optimal parameters used. On the other hand, Figure 5.10c) shows that PTES[CCE] maintains its performance against the Sch method regardless of the number of tags. From these results, we summarise that for both PTES[CE] and PTES[CCE], parameters $V_1$ = 2.3 and $V_2$ = 0.1 achieve the best system efficiency. Nevertheless, PTES[CCE] has the most stable performance compared with the other approaches.

Figure 5.10: Results of PTES[CE] and PTES[CCE] ($V_1$ = 2.3, $V_2$ = 0.1) versus Sch methods using Initial Q of 8 on different tag sets: Performance efficiency (a: PTES[CE], c: PTES[CCE]) and Number of frames (b: PTES[CE], d: PTES[CCE])

After considering performance efficiency, we now look into the total number of frames produced by each method. Figure 5.10b) and Figure 5.10d) show that PTES[CE] and PTES[CCE] with optimal parameters also used the lowest number of frames for every number of tags. These results are far superior, in terms of performance and number of frames required, to the performance of PTES[C] discussed in Part I. This is because PTES[C] uses only one variable, which considers only the impact of collision slots, while the other two PTES methods also take into consideration the impact of empty slots from previous frames. Justified by both system efficiency and number of frames, we conclude that PTES[CCE], with parameters $V_1$ = 2.3 and $V_2$ = 0.1 and the optimal initial Q of 8, is the best approach out of the three proposed PTES methods. Therefore, the PTES[CCE] approach is chosen as the frame-size prediction for the DFSA, EDFSA, and PCT methods in our PCT experiments.

5.3 Probabilistic Cluster-Based Technique

The PCT method employs a dynamic probabilistic algorithm concept and uses a group-splitting rule to split the Backlog into groups if the number of unread tags is higher than the maximum frame-size. The PCT approach first estimates the Backlog, that is, the number of remaining tags within the interrogation zone. If the Backlog is larger than the specific frame-size, it splits the Backlog into a number of groups and allows only one group of tags to respond. The reader then issues a Query, which contains a Q parameter to specify the frame-size (frame-size $F_{min} = 0$; $F_{max} = 2^Q - 1$). Each selected tag in the group picks a random number between 0 and $2^Q - 1$ and puts it into its slot counter. Only a tag whose slot counter is zero responds to the request. When the estimated Backlog is below the threshold, the reader adjusts the frame-size without grouping the unread tags. After each read cycle, the reader estimates the Backlog using the PTES algorithm and adjusts its frame-size.

Probabilistic Anti-Collision Algorithm using PTES

The PCT approach first estimates the number of unread tags, then decides whether the tags need to be split into groups.
The probabilistic anti-collision algorithm using PTES frame-size prediction is then applied to each selected group of tags. Algorithm 4 demonstrates the probabilistic anti-collision algorithm applied to each selected group of tags, where only one group of tags responds to the reader. There are three kinds of slot:

1. Successful slot: Where there is only one tag reply, the reader sends ACK(RN16) to the tag. The tag then backscatters its EPC to the reader and the reader issues QueryRep for the next slot.
2. Empty slot: Where there is no tag reply, the reader issues QueryRep for the next slot.
3. Collision slot: Where there is more than one tag reply, the reader issues QueryRep for the next slot.

    Reader sends Query
    for (Identification procedure) do
        Every tag generates RN16 and slot counter;
        for (Current frame) do
            if (Slot counter == 0) then
                Tag replies its RN16;
                if (A single tag replies) then
                    Reader sends ACK(RN16) to a tag;
                    if (RN16 received by tag == RN16 tag saved data) then
                        Tag sends (EPC+PC+CRC) to reader;
                    end
                    Reader sends QueryRep;
                end
                else if (Multiple tags reply) then
                    Reader sends QueryRep;
                end
                else if (No tag replies) then
                    Reader sends QueryRep;
                end
            end
            if (Tag receives QueryRep) then
                slot counter = slot counter - 1;
            end
        end
        Reader uses PTES algorithm to adjust the size of the new frame;
        Reader sends QueryAdjust;
    end

Algorithm 4: Probabilistic anti-collision algorithm with PTES Frame-Size Prediction

After a QueryRep command is received, each tag decreases its slot counter by 1. At the end of each frame, the reader checks whether all tags have been identified. The reader then estimates the Backlog using the PTES algorithm and adjusts its frame-size.

PCT Preliminary

Instead of splitting tags into groups randomly, the PCT approach derives its rules from particular equations, according to the optimal system efficiency obtained for a specific number of tags. We first conducted an experiment to acquire the optimal frame-size for a specific number of tags, as shown in Figure 5.11. It can be seen that the optimal system efficiency achieved by the probabilistic ALOHA method is approximately 38 percent and that the optimal number of tags is close to the maximum frame-size. Efficiency is calculated as shown in Equation 5.3:

$$\text{Efficiency} = \frac{S}{S + C + E} \qquad (5.3)$$

where S is the number of Successful slots, C is the number of Collision slots, and E is the number of Empty slots.

Figure 5.11: Performance efficiency of different frame-size on different number of tags

From the results acquired for the performance efficiency evaluation, we have developed equations 5.4, 5.5, 5.6, 5.7 and 5.8 to find the minimum and maximum number of tags suitable for a particular frame-size. These minimum and maximum numbers of tags are derived to acquire the optimal performance efficiencies, as in Figure 5.11. Each equation is then used to derive rules for PCT.

To show the derivation of equations 5.4, 5.5, 5.6, 5.7 and 5.8 in detail, Table 5.12 lists the information found from Figure 5.11 together with all missing fields. Table 5.12 shows that at the optimal system efficiency of 38 percent, the number of tags is equal to the available frame-size calculated by $2^Q$. We have set the minimum and maximum boundary efficiencies at 33 percent. The information on the maximum number of tags at 33 percent is also available from Figure 5.11. For example, when Q is equal to 8, the optimal percentage efficiency is obtained at 256 tags and the number of tags at the maximum boundary is equal to 352 tags.
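Equation 5.3 and the three slot types can be illustrated with a short simulation of a single frame. This is a minimal sketch, assuming every tag draws its slot counter uniformly from 0 to $2^Q - 1$ and that no capture effect or channel error occurs. The function names `run_frame` and `efficiency` are hypothetical and this is not the simulator used for the experiments.

```python
import random
from collections import Counter


def run_frame(num_tags, q, rng):
    """Simulate one frame and return (successful, collision, empty) slot counts.

    Assumption: each tag independently picks a slot in [0, 2**q - 1].
    """
    frame_size = 2 ** q
    replies_per_slot = Counter(rng.randrange(frame_size) for _ in range(num_tags))
    successful = sum(1 for n in replies_per_slot.values() if n == 1)
    collision = sum(1 for n in replies_per_slot.values() if n > 1)
    empty = frame_size - len(replies_per_slot)  # slots nobody picked
    return successful, collision, empty


def efficiency(successful, collision, empty):
    """Equation 5.3: S / (S + C + E)."""
    return successful / (successful + collision + empty)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
for tags in (100, 177, 256, 352, 500):
    runs = [efficiency(*run_frame(tags, 8, rng)) for _ in range(200)]
    print(tags, round(sum(runs) / len(runs), 3))
```

In this simplified model the averaged efficiency for a frame-size of 256 peaks when the number of tags is close to 256 and falls off on either side, which is the shape that the minimum and maximum boundaries are meant to capture.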

$$
\begin{aligned}
\text{max} &= 2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)} \\
\text{min} &= (2^{(Q-1)} + 2^{(Q-2)} - 2^{(Q-3)} + 2^{(Q-4)}) + 1
\end{aligned}
\qquad (5.4)
$$

$$
\begin{aligned}
\text{max} &= (2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}) + (2^{(Q-2)} + 2^{(Q-3)} - 2^{(Q-4)} + 2^{(Q-5)}) \\
\text{min} &= (2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}) + 1
\end{aligned}
\qquad (5.5)
$$

$$
\begin{aligned}
\text{max} &= (2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}) + (2^{(Q-1)} + 2^{(Q-2)} - 2^{(Q-3)} + 2^{(Q-4)}) \\
\text{min} &= (2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}) + 1
\end{aligned}
\qquad (5.6)
$$

$$
\begin{aligned}
\text{max} &= (2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}) + (2^{(Q-1)} + 2^{(Q-2)} - 2^{(Q-3)} + 2^{(Q-4)}) \\
\text{min} &= [(2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}) + (2^{(Q-2)} + 2^{(Q-3)} - 2^{(Q-4)} + 2^{(Q-5)})] + 1
\end{aligned}
\qquad (5.7)
$$

$$
\begin{aligned}
\text{max} &= (2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}) + (2^{(Q-1)} + 2^{(Q-2)} - 2^{(Q-3)} + 2^{(Q-4)}) + (2^{(Q-2)} + 2^{(Q-3)} - 2^{(Q-4)} + 2^{(Q-5)}) \\
\text{min} &= [(2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}) + (2^{(Q-1)} + 2^{(Q-2)} - 2^{(Q-3)} + 2^{(Q-4)})] + 1
\end{aligned}
\qquad (5.8)
$$

Figure 5.12 illustrates the minimum and maximum boundaries and their correlated percentage of efficiency for a frame-size of 256. The figure shows that when the number of tags equals 177 or 352, the efficiency is equal to 33 percent.

Table 5.12: Available Information and Missing fields on System Efficiency. MinB = Minimum point of occurrence, MaxB = Maximum point of occurrence. Columns: Q, MinB at 33%, MaxB at 33%, Optimal at 38%. Rows: Q = 9 down to Q = 1, with missing fields marked Unknown.

Table 5.13 gives the derived answers for the missing fields in Table 5.12. The minimum boundary with 33 percent efficiency is calculated as the maximum boundary of the previous frame plus 1. Thus, for Q8, the minimum boundary is equal to 177 (176 + 1). After finding all the information needed, the reverse engineered equations for the maximum and minimum boundaries are derived for each Q. Once the outcomes for each Q are known, it is possible to find the equations for two or more types of Q. In order to simplify the derived equations, we employ β (Beta), κ (Kappa), and µ (Mu), and use these three symbols to express each rule.
In this research, we proposed three rules for PCT: PCT256, PCT128, and PCT-E (PCT-Extended). All rules split the Backlog into groups and then use one of initial Q8 (frame-size 256), Q7 (frame-size 128), or Q6 (frame-size 64) to identify the current set of tags. Equation 5.9 shows the conversion of all three key sets, from equations 5.4 to 5.8, into β, κ, and µ.

$$
\begin{aligned}
\beta &= 2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)} \\
\kappa &= 2^{(Q-1)} + 2^{(Q-2)} - 2^{(Q-3)} + 2^{(Q-4)} \\
\mu &= 2^{(Q-2)} + 2^{(Q-3)} - 2^{(Q-4)} + 2^{(Q-5)}
\end{aligned}
\qquad (5.9)
$$

Figure 5.12: The minimum and maximum boundaries and their correlated percentage of efficiency for frame-size of 256

From equations 5.4, 5.5, 5.6, 5.7 and 5.8, we derived three key sets within these equations. These key sets are converted into β, κ, and µ and applied to each PCT rule, as shown in Table 5.14. Table 5.14 displays the conversion of equations 5.4 to 5.8, with the minimum and maximum boundaries for each rule. For instance, Equation 5.4 is applied to all three rules: PCT256, PCT128, and PCT-E. However, Equation 5.5 only applies to PCT-E.
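The key sets of Equation 5.9 and the boundaries of equations 5.4 to 5.8 can be checked numerically. The sketch below is only a worked illustration: the function names `beta`, `kappa`, `mu`, and `boundaries` are hypothetical, and the dictionary keys simply reuse the equation numbers. Results are integers for Q of 5 or more; smaller Q would give fractional powers of two.

```python
def beta(q):
    """Equation 5.9, first key set."""
    return 2 ** q + 2 ** (q - 1) - 2 ** (q - 2) + 2 ** (q - 3)


def kappa(q):
    """Equation 5.9, second key set (same value as beta(q - 1))."""
    return 2 ** (q - 1) + 2 ** (q - 2) - 2 ** (q - 3) + 2 ** (q - 4)


def mu(q):
    """Equation 5.9, third key set (same value as beta(q - 2))."""
    return 2 ** (q - 2) + 2 ** (q - 3) - 2 ** (q - 4) + 2 ** (q - 5)


def boundaries(q):
    """Return {equation number: (min, max)} for equations 5.4 to 5.8."""
    b, k, m = beta(q), kappa(q), mu(q)
    return {
        "5.4": (k + 1, b),
        "5.5": (b + 1, b + m),
        "5.6": (b + 1, b + k),
        "5.7": (b + m + 1, b + k),
        "5.8": (b + k + 1, b + k + m),
    }


for eq, (low, high) in boundaries(8).items():
    print(eq, low, high)
```

For Q = 8 this gives β = 352, κ = 176, and µ = 88, which reproduces the boundaries 177 to 352 for Equation 5.4 and 529 to 616 for Equation 5.8.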

Table 5.13: Derived Equations for Missing fields on System Efficiency. MinB = Minimum point of occurrence, MaxB = Maximum point of occurrence. For every Q from 9 down to 1, the reverse engineered equation for MinB at 33% is $(2^{(Q-1)} + 2^{(Q-2)} - 2^{(Q-3)} + 2^{(Q-4)}) + 1$ and the reverse engineered equation for MaxB at 33% is $2^{Q} + 2^{(Q-1)} - 2^{(Q-2)} + 2^{(Q-3)}$.

Sample Boundary Computation

To demonstrate the computation of each PCT rule, we give a sample calculation for each rule that utilises equations 5.4 to 5.8 from Table 5.14.

Table 5.14: The conversion of PCT rules to β Beta, κ Kappa, and µ Mu

    Equation   Applies to               Max          Min
    (5.4)      PCT256, PCT128, PCT-E    β            κ + 1
    (5.5)      PCT-E                    β + µ        β + 1
    (5.6)      PCT256, PCT128           β + κ        β + 1
    (5.7)      PCT-E                    β + κ        [β + µ] + 1
    (5.8)      PCT-E                    β + κ + µ    [β + κ] + 1

Equation 5.4 computation

When only one type of Q, either 7 or 8, is applied during the identification cycle, Equation 5.4 is used to calculate the minimum and maximum number of tags. Therefore, for the PCT256, PCT128 and PCT-E rules, we obtained the maximum and the minimum number of tags by rewriting Equation 5.4, as shown in Tables 5.15, 5.17 and 5.19. For instance, for PCT256 where Q = 8, Equation 5.4 can be rewritten as follows:

$$
\begin{aligned}
\text{max} &= \beta = 2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)} = 352 \\
\text{min} &= \kappa + 1 = (2^{(8-1)} + 2^{(8-2)} - 2^{(8-3)} + 2^{(8-4)}) + 1 = 177
\end{aligned}
$$

Therefore, for PCT256 after applying Equation 5.4, we obtained a maximum number of tags of 352 and a minimum number of tags of 177. Based on the same principle as PCT256, for PCT128 where Q = 7, Equation 5.4 can be rewritten as follows:

$$
\begin{aligned}
\text{max} &= \beta = 2^{7} + 2^{(7-1)} - 2^{(7-2)} + 2^{(7-3)} = 176 \\
\text{min} &= \kappa + 1 = (2^{(7-1)} + 2^{(7-2)} - 2^{(7-3)} + 2^{(7-4)}) + 1 = 89
\end{aligned}
$$

Thus, for PCT128 after applying Equation 5.4, we obtained a maximum number of tags of 176 and a minimum number of tags of 89.

Equation 5.5 and 5.7 computations

Equations 5.5 and 5.7 are used to calculate the minimum and maximum number of tags for PCT rules when two types of Q, either 8 & 7 or 8 & 6, are used during the identification cycle. This rule only applies to PCT-E (see Table 5.14). After rewriting Equation 5.5 or 5.7, we obtained the maximum and the minimum number of tags as shown in Table 5.19. For example, for PCT-E where Q = 8, Equation 5.5 can be rewritten as follows:

$$
\begin{aligned}
\text{max} &= \beta + \mu = (2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + (2^{(8-2)} + 2^{(8-3)} - 2^{(8-4)} + 2^{(8-5)}) = 440 \\
\text{min} &= \beta + 1 = (2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + 1 = 353
\end{aligned}
$$

Thus, for PCT-E after applying Equation 5.5, we obtained a maximum number of tags of 440 and a minimum number of tags of 353. In addition, for PCT-E where Q = 8, Equation 5.7 can be rewritten as follows:

$$
\begin{aligned}
\text{max} &= \beta + \kappa = (2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + (2^{(8-1)} + 2^{(8-2)} - 2^{(8-3)} + 2^{(8-4)}) = 528 \\
\text{min} &= [\beta + \mu] + 1 = [(2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + (2^{(8-2)} + 2^{(8-3)} - 2^{(8-4)} + 2^{(8-5)})] + 1 = 441
\end{aligned}
$$

Therefore, for PCT-E after applying Equation 5.7, we obtained a maximum number of tags of 528 and a minimum number of tags of 441.

Equation 5.6 computation

When two types of Q, either 8 & 7 or 7 & 6, are used during the identification cycle, Equation 5.6 is used to calculate the minimum and maximum number of tags for PCT rules. This rule applies to PCT256 and PCT128 but does not apply to PCT-E (see Table 5.14). After rewriting Equation 5.6, we obtained the maximum and the minimum number of tags as shown in Tables 5.15 and 5.17. For example, for PCT256 where Q = 8, Equation 5.6 can be rewritten as follows:

$$
\begin{aligned}
\text{max} &= \beta + \kappa = (2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + (2^{(8-1)} + 2^{(8-2)} - 2^{(8-3)} + 2^{(8-4)}) = 528 \\
\text{min} &= \beta + 1 = (2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + 1 = 353
\end{aligned}
$$

Therefore, for PCT256 after applying Equation 5.6, we obtained a maximum number of tags of 528 and a minimum number of tags of 353. Also, for PCT128 where Q = 7, Equation 5.6 can be rewritten as follows:

$$
\begin{aligned}
\text{max} &= \beta + \kappa = (2^{7} + 2^{(7-1)} - 2^{(7-2)} + 2^{(7-3)}) + (2^{(7-1)} + 2^{(7-2)} - 2^{(7-3)} + 2^{(7-4)}) = 264 \\
\text{min} &= \beta + 1 = (2^{7} + 2^{(7-1)} - 2^{(7-2)} + 2^{(7-3)}) + 1 = 177
\end{aligned}
$$

Therefore, for PCT128 after applying Equation 5.6, we obtained a maximum number of tags of 264 and a minimum number of tags of 177.

Equation 5.8 computation

Equation 5.8 is used to calculate the minimum and maximum number of tags when three types of Q (6, 7, and 8) are applied in the PCT-E rule. After rewriting Equation 5.8, we obtained the maximum and the minimum number of tags as shown in Table 5.19. For instance, for PCT-E where Q = 8, Equation 5.8 can be rewritten as follows:

$$
\begin{aligned}
\text{max} &= \beta + \kappa + \mu = (2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + (2^{(8-1)} + 2^{(8-2)} - 2^{(8-3)} + 2^{(8-4)}) + (2^{(8-2)} + 2^{(8-3)} - 2^{(8-4)} + 2^{(8-5)}) = 616 \\
\text{min} &= [\beta + \kappa] + 1 = [(2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + (2^{(8-1)} + 2^{(8-2)} - 2^{(8-3)} + 2^{(8-4)})] + 1 = 529
\end{aligned}
$$

Therefore, for PCT-E after applying Equation 5.8, we obtained a maximum number of tags of 616 and a minimum number of tags of 529.

For both the PCT256 and PCT-E rules, if the number of unread tags is larger than 352, we must divide the tags into two or more groups in order to achieve the optimal system efficiency. If the number of unread tags is 352 or smaller, we let every unread tag respond. Similarly, for the PCT128 rule, if the number of unread tags is larger than 176, we must divide the tags into two or more groups in order to achieve the optimal system efficiency. By doing this, we can always obtain the expected system efficiency as displayed in Figure

PCT Rules

The PCT approach derives its rules from particular equations expressed by β Beta, κ Kappa, and µ Mu. All rules split the Backlog into groups and then use one of Q8 (frame-size 256), Q7 (frame-size 128), or Q6 (frame-size 64) to identify the current set of tags. We make the assumption that the performance efficiency can be improved by dividing tags into an accurate number of groups and then performing the tag identification separately for each group. In this research, we have chosen the frame-sizes of 256, 128, and 64 for our PCT rules since the initial Qs of 8, 7 and 6 provide the most appropriate range for the current RFID reader and passive tag specifications. Generally, a UHF reader is capable of capturing varying numbers of passive tags, depending on the reader type and tag class (e.g. Class 0: Read-only tag). Thus, the selected initial Qs are the most suitable for our proposed rules. Each PCT rule, with its minimum and maximum boundaries, is explained as follows:

PCT256

PCT256 uses either a frame-size of 256 (Q = 8) or a frame-size of 128 (Q = 7) for tag identification. We assume that the identification time and performance efficiency of our proposed PCT256 will improve on the existing probabilistic approaches. From the preliminary for all PCT rules, we obtained specific equations to calculate the minimum and maximum boundaries for the PCT256 rule. These equations are applied as shown in Table 5.15.

Table 5.15: PCT256 Boundary Computation - number of group (Frame-Size 256 and 128), and minimum and maximum boundaries

    FS 256   FS 128   Minimum Bound    Maximum Bound
    4        -        [3β + κ] + 1     4β
    3        1        3β + 1           3β + κ
    3        -        [2β + κ] + 1     3β
    2        1        2β + 1           2β + κ
    2        -        [β + κ] + 1      2β
    1        1        β + 1            β + κ
    1        -        κ + 1            β

Table 5.15 displays the relevant equations for calculating the minimum and maximum boundaries for the PCT256 rule. The table shows that there are two frame-sizes, 256 and 128, for group division. For example, when the group division comprises three groups of 256 and one group of 128, the minimum boundary is calculated by 3β + 1 and the maximum boundary is calculated by 3β + κ. Following the computation, the minimum and maximum boundaries are 1057 and 1232 respectively, as shown in Table 5.16. The detailed calculation is presented as follows:

$$
\begin{aligned}
\text{max} &= 3\beta + \kappa = 3(2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + (2^{(8-1)} + 2^{(8-2)} - 2^{(8-3)} + 2^{(8-4)}) = 1232 \\
\text{min} &= 3\beta + 1 = 3(2^{8} + 2^{(8-1)} - 2^{(8-2)} + 2^{(8-3)}) + 1 = 1057
\end{aligned}
$$

After applying the specific equations for each group division, Table 5.16 shows the final PCT rule for PCT256. For instance, if the Backlog equals 900 tags, the PCT256 algorithm will split the unread tags into three groups of Q8 (256). Algorithm 5 demonstrates the group splitting algorithm using the PCT256 rule, which either keeps the tags in a single group or splits them into a number of groups according to the PCT256 rule.
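The boundary pattern for rules with two frame-sizes can be sketched as a direct computation instead of a table lookup. This is a minimal illustration, assuming the bands repeat exactly as the β and κ boundaries describe and that backlogs above the largest listed bound (4β for PCT256, 8β for PCT128) are out of scope. The function name `split_two_frame_sizes` and its parameters are hypothetical, and a Backlog below κ + 1 is simply kept in a single group.

```python
def beta(q):
    """Equation 5.9: 2^Q + 2^(Q-1) - 2^(Q-2) + 2^(Q-3)."""
    return 2 ** q + 2 ** (q - 1) - 2 ** (q - 2) + 2 ** (q - 3)


def split_two_frame_sizes(backlog, q, max_large_groups):
    """Return (groups of frame-size 2**q, groups of frame-size 2**(q - 1)).

    Assumptions: q = 8 with max_large_groups = 4 mimics PCT256,
    q = 7 with max_large_groups = 8 mimics PCT128.
    """
    b = beta(q)
    k = beta(q - 1)  # kappa(Q) has the same value as beta(Q - 1)
    if backlog > max_large_groups * b:
        raise ValueError("backlog exceeds the largest bound in the rule table")
    if backlog <= b:
        return 1, 0  # single group, no splitting
    full, rest = divmod(backlog - 1, b)
    position = rest + 1  # position inside the current band, from 1 to beta
    if position <= k:
        return full, 1  # band [n*beta + 1, n*beta + kappa]
    return full + 1, 0  # band [n*beta + kappa + 1, (n + 1)*beta]


print(split_two_frame_sizes(900, 8, 4))  # PCT256-style split
print(split_two_frame_sizes(900, 7, 8))  # PCT128-style split
```

For a Backlog of 900 tags, the sketch yields three groups of 256 under the PCT256-style settings and five groups of 128 plus one group of 64 under the PCT128-style settings.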

Table 5.16: PCT256 Rule - The number of unread tags, optimal frame-size (A and B), and number of group (A and B). Columns: Backlogs (range of unread tags), FS A, Group A, FS B, Group B.

    Input: Tagcount
    Output: Number of Group
    for (Group Splitting procedure) do
        if Tagcount less than 353 tags then
            Keep tag into a single group;
        end
        else
            while Looking up PCT256 Rule Table do
                if Found Matched rule for specific Backlog then
                    Split tags into groups;
                end
            end
        end
        Output number of groups;
    end

Algorithm 5: Group Splitting Algorithm using PCT256 Rule

PCT128

PCT128 uses either a frame-size of 128 (Q = 7) or a frame-size of 64 (Q = 6) for tag identification. In some cases PCT128 uses a higher number of groups than the PCT256 method, which may result in worse performance efficiency for specific numbers of tags. We calculate the minimum and maximum boundaries for the PCT128 rule according to specific equations. These equations are applied as shown in Table 5.17.

Table 5.17: PCT128 Boundary Computation - number of group (Frame-Size 128 and 64), and minimum and maximum boundaries

    FS 128   FS 64   Minimum Bound    Maximum Bound
    8        -       [7β + κ] + 1     8β
    7        1       7β + 1           7β + κ
    7        -       [6β + κ] + 1     7β
    6        1       6β + 1           6β + κ
    6        -       [5β + κ] + 1     6β
    5        1       5β + 1           5β + κ
    5        -       [4β + κ] + 1     5β
    4        1       4β + 1           4β + κ
    4        -       [3β + κ] + 1     4β
    3        1       3β + 1           3β + κ
    3        -       [2β + κ] + 1     3β
    2        1       2β + 1           2β + κ
    2        -       [β + κ] + 1      2β
    1        1       β + 1            β + κ
    1        -       κ + 1            β

Table 5.17 displays the relevant equations for calculating the minimum and maximum boundaries for the PCT128 rule. The table shows that there are two frame-sizes, 128 and 64, for group division. For example, when the group division comprises five groups of 128 and one group of 64, the minimum boundary is calculated by 5β + 1 and the maximum boundary is calculated by 5β + κ. Following the computation, the minimum and maximum boundaries are 881 and 968 respectively, as shown in Table 5.18.

Table 5.18: PCT128 Rule - The number of unread tags, optimal frame-size (A and B), and number of group (A and B). Columns: Backlogs (range of unread tags), FS A, Group A, FS B, Group B.

Table 5.18 shows the PCT rule for PCT128. For instance, if the Backlog equals 900 tags, the PCT128 algorithm will split the unread tags into five groups of Q7 (128) and one group of Q6 (64). Algorithm 6 demonstrates the group splitting algorithm using the PCT128 rule, which either keeps the tags in a single group or splits them into a number of groups according to the PCT128 rule.

    Input: Tagcount
    Output: Number of Group
    for (Group Splitting procedure) do
        if Tagcount less than 177 tags then
            Keep tag into a single group;
        end
        else
            while Looking up PCT128 Rule Table do
                if Found Matched rule for specific Backlog then
                    Split tags into groups;
                end
            end
        end
        Output number of groups;
    end

Algorithm 6: Group Splitting Algorithm using PCT128 Rule

PCT-Extended

The rules of PCT-Extended (PCT-E) are more complex than those of PCT256 and PCT128. This is because PCT-E identifies tags using three different frame-sizes, 256 (Q = 8), 128 (Q = 7), and 64 (Q = 6), instead of two. We assume that the performance efficiency of PCT-E can improve further on PCT256. However, the identification time may increase due to the higher number of groups applied in each identification round. From the preliminary for all PCT rules, we obtained specific equations to calculate the minimum and maximum boundaries for the PCT-E rule. These equations are applied as shown in Table 5.19.

Table 5.19 presents the relevant equations for calculating the minimum and maximum boundaries for the PCT-E rule. The table shows that there are three frame-sizes, 256, 128 and 64, for group division. For instance, when the group division comprises two groups of 256, one group of 128, and one group of 64, the minimum boundary is calculated by [2β + κ] + 1 and the maximum boundary is calculated by 2β + κ + µ. Following the computation, the minimum and maximum boundaries are 881 and 968 respectively, as shown in Table 5.20. Table 5.20 displays the PCT-E rule.
For instance, if the Backlog equals 900 tags, the PCT-E algorithm will split the unread tags into two groups of Q8 (256), one group of Q7 (128), and one group of Q6 (64).
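The three-frame-size split can be sketched in the same way as the two-frame-size rules. This is an illustration under stated assumptions: within each band of width β, the sub-bands end at µ, κ, κ + µ, and β, and backlogs above 4β are treated as out of scope. The function name `split_pct_e` is hypothetical and this is not the thesis implementation.

```python
def beta(q):
    """Equation 5.9: 2^Q + 2^(Q-1) - 2^(Q-2) + 2^(Q-3)."""
    return 2 ** q + 2 ** (q - 1) - 2 ** (q - 2) + 2 ** (q - 3)


def split_pct_e(backlog, q=8, max_large_groups=4):
    """Return (groups of 2**q, groups of 2**(q - 1), groups of 2**(q - 2))."""
    b = beta(q)
    k = beta(q - 1)  # same value as kappa(Q)
    m = beta(q - 2)  # same value as mu(Q)
    if backlog > max_large_groups * b:
        raise ValueError("backlog exceeds the largest bound in the rule table")
    if backlog <= b:
        return 1, 0, 0  # single group, no splitting
    full, rest = divmod(backlog - 1, b)
    position = rest + 1  # position inside the current band, from 1 to beta
    if position <= m:
        return full, 0, 1  # up to n*beta + mu
    if position <= k:
        return full, 1, 0  # up to n*beta + kappa
    if position <= k + m:
        return full, 1, 1  # up to n*beta + kappa + mu
    return full + 1, 0, 0  # up to (n + 1)*beta


print(split_pct_e(900))
```

For a Backlog of 900 tags the sketch returns two groups of 256, one group of 128, and one group of 64, matching the example in the text.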

Table 5.19: PCT-E Boundary Computation - number of group (Frame-Size 256, 128 and 64), and minimum and maximum boundaries

    FS 256   FS 128   FS 64   Minimum Bound        Maximum Bound
    4        -        -       [3β + κ + µ] + 1     4β
    3        1        1       [3β + κ] + 1         3β + κ + µ
    3        1        -       [3β + µ] + 1         3β + κ
    3        -        1       3β + 1               3β + µ
    3        -        -       [2β + κ + µ] + 1     3β
    2        1        1       [2β + κ] + 1         2β + κ + µ
    2        1        -       [2β + µ] + 1         2β + κ
    2        -        1       2β + 1               2β + µ
    2        -        -       [β + κ + µ] + 1      2β
    1        1        1       [β + κ] + 1          β + κ + µ
    1        1        -       [β + µ] + 1          β + κ
    1        -        1       β + 1                β + µ
    1        -        -       κ + 1                β

Table 5.20: PCT-E Rule - The number of unread tags, optimal frame-size (A, B, C), and number of group (A, B, C). Columns: Backlogs (range of unread tags), FS A, Group A, FS B, Group B, FS C, Group C.

Algorithm 7 demonstrates the group splitting algorithm using the PCT-E rule, which either keeps the tags in a single group or splits them into a number of groups according to the PCT-E rule.

    Input: Tagcount
    Output: Number of Group
    for (Group Splitting procedure) do
        if Tagcount less than 353 tags then
            Keep tag into a single group;
        end
        else
            while Looking up PCT-E Rule Table do
                if Found Matched rule for specific Backlog then
                    Split tags into groups;
                end
            end
        end
        Output number of groups;
    end

Algorithm 7: Group Splitting Algorithm using PCT-E Rule

Experimental Evaluation

In order to show the significance of our proposed PCT method, we conducted an experimental evaluation and compared our methods to existing techniques.

Preliminary

To study the Probabilistic Cluster-Based Technique, the experiment is assumed to be under the same environment as the experimental evaluation of PTES (Section 5.2).

Experiment Data Set

The aim of the experiment is to compare the performance of our proposed PCT method to the existing probabilistic DFSA and EDFSA anti-collision approaches. In this experiment, we considered different numbers of tags, from 100 to 1400, within the interrogation zone. The number of simulated tags is assumed to be no more than 1400, due to the maximum range of the UHF reader and passive tags. For each identification round, the optimal tunable parameters of PTES[CCE] are applied to the different Initial Qs. For instance, when PCT initiates a new frame with an Initial Q of 8, the tunable parameter $V_1$ is set to 2.3 and parameter $V_2$ is set to 0.1. Table 5.21 displays the ALOHA-based anti-collision methods applied to different numbers of tags and tunable initial Qs. All methods are applied separately to different randomly generated data sets, giving a total of 70 test cases (14 for each method) within this experiment.

151 5.3. PROBABILISTIC CLUSTER-BASED TECHNIQUE Table 5.21: Chosen Parameters for Experiment Three Initial Q Backlog Prediction No. Tags DFSA 8 PTES[CCE] 100 to 1400 EDFSA 8 PTES[CCE] 100 to 1400 PCT256 tunable between 7, 8 PTES[CCE] 100 to 1400 PCT128 tunable between 6, 7 PTES[CCE] 100 to 1400 PCT-E tunable between 6, 7, 8 PTES[CCE] 100 to Results Our experiment evaluates the performance of our proposed PCT method to existing DFSA and EDFSA approaches. Corresponding between Table 5.22 and Figure 5.13a), it can be seen that both PCT256 and PCT-E produced minimal number of slots during identification process, compared with other methods. Specifically, PCT256 and PCT-E technique minimised the number of slots from EDFSA approach when the number of tags is between 400 and 500, and between 800 and 1200 tags. This is because the number of group sets for EDFSA will be doubled when the number of Backlog reached the specific threshold; while PCT increased number of group slowly, according to the estimated number of unread tags. As a result, the number of slots are minimised for PCT256. On the other hand, PCT128 performed better than DFSA but did not outperform the EDFSA. According to optimal efficiency displayed in Figure 5.11, the initial Q of 8 (frame-size = 256) has a wider range of optimal efficiency compared with the initial Q of 7. Therefore, PCT256 with initial frame-size of 256, has a better performance than the PCT128 with initial frame-size of 128. Table 5.22: Number of slots comparison and Performance efficiency for DFSA, EDFSA, PCT128, PCT256, and PCT-E methods on different number of tags Number of Slots Efficiency Tags D ED P128 P256 P-E D ED P128 P256 P-E Table 5.22 shows that there is no improvement to our proposed methods compared with existing methods when the number of tags are low (up to around 300 tags). This is because PCT methods start dividing tags into groups only when the number of tags 125

152 CHAPTER 5. PROBABILISTIC ANTI-COLLISION APPROACHES Figure 5.13: Number of slots comparison (a) and Performance efficiency (b) for DFSA, EDFSA, PCT128, PCT256, and PCT-E methods on different number of tags reaches the specific threshold. As a result, for certain tag sizes, the number of slots and performance efficiency remained unchanged due to the same identification procedure, compared with DFSA and EDFSA methods. Moreover, Table 5.22 also demonstrates that the PCT128 is the only method that has different results when the number of tags are 100 and 200 tags. This is due to the fact that PCT128 is the only method that uses initial frame-size of 128 to predict Backlog. Therefore, even when the number of tags is still low, the PCT128 starts splitting tags into group, resulting in different tag outcomes. Furthermore, when the number of tags is 100 tags, PCT128 shows the minimal number of slots issues. The reason for the outcome is because PCT128 uses frame-size of 7 instead 126

of 8, as used in the other methods. Thus, when the number of tags is as low as 100, PCT128 performs best. However, Figure 5.13b) shows that the performance efficiency of PCT128 stabilises and does not improve any further as the number of tags increases.

Table 5.22 and Figure 5.13b) show that both PCT256 and PCT-E kept their system efficiency above the other methods and had the most stable performance. Nevertheless, PCT-E required more group sets than PCT256 throughout the identification process (see Table 5.16 versus Table 5.20). As a result, PCT-E required extra time to initiate new groups compared with PCT256. On the other hand, the efficiency of DFSA dropped dramatically as the number of tags increased, while the efficiency of EDFSA became unstable when its number of groups doubled from 1 to 2 and from 2 to 4. PCT128 has steady performance but does not perform as well as PCT256.

Table 5.23: Percentage improvement of the proposed PCT128, PCT256, and PCT-E versus existing EDFSA (ED) and DFSA (D) techniques
Columns: PCT128 (improved from ED, improved from D); PCT256 (improved from ED, improved from D); PCT-E (improved from ED, improved from D); final row: Average

Table 5.23 shows the percentage improvement of the proposed PCT methods over the EDFSA and DFSA methods. When the number of tags is low and the PCT methods have not yet divided the tags into groups, there is no difference between the outcome of our methods and that of existing methods, and the percentage improvement remains unchanged. However, when the number of groups doubles from 1 to 2 (400 tags) and from 2 to 4 (800 tags), PCT256 and PCT-E show their highest percentage improvement over the EDFSA method.

Against the DFSA method, on the other hand, the percentage improvement increases more steadily, since DFSA does not apply any group-splitting rules. PCT256 performs better than EDFSA by about 4 percent on average, and approximately 11 percent better than DFSA, as demonstrated in

Figure. The percentage improvement of the PCT256 method can reach up to 14 percent and 21 percent over EDFSA and DFSA respectively, depending on the number of tags within the interrogation zone. Nevertheless, the PCT-E method required more groups than the PCT256 method and achieved a slightly lower percentage improvement.

Figure 5.14: Percentage of improvement of PCT compared with DFSA and EDFSA methods

On the other hand, PCT128 performs better than the DFSA method by around 6 percent on average, but does not show any improvement over the EDFSA technique, as displayed in Figure. However, PCT128 still shows some improvement in some cases, reaching up to 16 percent over the EDFSA and DFSA methods. Therefore, we conclude that our proposed PCT256 method is the most effective in terms of system efficiency and minimisation of the number of slots.

5.4 Overall Analysis

A total of three experiments were conducted for the ALOHA-based anti-collision approaches. The first two experimental evaluations verify the performance of our proposed Precise Tag Estimation Scheme and identify the best tunable parameters for the method. The third experiment compares the performance of our proposed Probabilistic Cluster-Based Technique with existing ALOHA-based methods.

From the first two experiments, we determined that an initial Q of 8 is, on average, the most suitable initial Q for our proposed PTES methods. Nevertheless, this initial Q is mainly appropriate for the PTES method when it is incorporated with a probabilistic approach that has no grouping rules. Additionally, we verified that PTES[CCE], with parameters V1 = 2.3 and V2 = 0.1 and

an optimal initial Q of 8, is the best of the three proposed PTES methods. In addition to the first two experiments, the final experiment demonstrated that our proposed PCT method is the most effective in terms of system efficiency and minimisation of the number of slots. Specifically, PCT256 gives the highest performance efficiency and outperforms existing ALOHA-based anti-collision methods.

From the analysis of all experiments, we recognised certain properties of importance for ALOHA-based anti-collision methods: 1) initial Q (frame-size) initialisation; 2) accuracy of Backlog prediction techniques; and 3) overall number of tags within the interrogation zone.

5.5 Summary

In this chapter, we have investigated the problems of existing probabilistic anti-collision approaches and proposed a new frame-size estimation scheme that predicts a more precise frame-size for use with a probabilistic anti-collision technique. We also proposed a new probabilistic anti-collision method to eliminate shortcomings of existing approaches. The main contributions and findings of this chapter are as follows:

We have proposed a Precise Tag Estimation Scheme (PTES) (Pupunwiwat and Stantic, 2010a,b), a method that estimates the precise remaining Backlog by using information about collision slots and empty slots from the previous frame. We found that PTES with a correctly tuned Q parameter and variables V1 and V2 can achieve optimal results and outperform existing frame-size estimation techniques.

We also proposed a Probabilistic Cluster-Based Technique (PCT) (Pupunwiwat and Stantic, 2010d) to maximise the efficiency of the tag identification process and to be incorporated with our proposed PTES. We discovered that PCT performed best compared with existing ALOHA-based anti-collision techniques, regardless of the number of tags within the interrogation zone.

We have confirmed that the tunable initial Q, the accuracy of Backlog prediction techniques with correct variables, and the overall number of tags within the interrogation zone all affect the performance of any ALOHA-based anti-collision scheme. Nevertheless, the best performing approaches in terms of system efficiency and robustness of the RFID system are our proposed probabilistic anti-collision techniques.
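The dependence on the initial Q noted in this chapter can be illustrated with the textbook expectation for a single frame of framed slotted ALOHA, in which every tag picks one of 2^Q slots uniformly at random. The sketch below is not the simulation used in our experiments; the function names, the efficiency floor of 0.30, and the single-frame model are assumptions made purely for illustration.

```python
def expected_efficiency(n_tags: int, q: int) -> float:
    """Expected share of single-tag (successful) slots in one frame.

    Assumes the textbook framed slotted ALOHA model: each of the
    n_tags tags independently picks one of 2**q slots at random.
    """
    frame_size = 2 ** q
    p = 1.0 / frame_size
    # A given slot succeeds when exactly one of the tags chooses it,
    # so the expected fraction of successful slots is n * p * (1-p)^(n-1).
    return n_tags * p * (1.0 - p) ** (n_tags - 1)


def near_optimal_span(q: int, floor: float = 0.30, max_tags: int = 1400) -> int:
    """Count the tag populations whose expected efficiency is at least `floor`.

    `floor` is an illustrative cut-off, not a value from the experiments.
    """
    return sum(
        1
        for n in range(1, max_tags + 1)
        if expected_efficiency(n, q) >= floor
    )


if __name__ == "__main__":
    for q in (7, 8):
        print(f"Q={q} frame-size={2 ** q} "
              f"populations above floor={near_optimal_span(q)}")
```

Under this model the expected efficiency peaks when the number of tags is close to the frame-size, so a larger initial frame stays near its peak over a wider span of tag counts, which is consistent with the observation made for an initial Q of 8 versus 7.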


6 Conceptual Selective Technique Management

In this chapter, we analyse our proposed deterministic and probabilistic anti-collision approaches and determine the best-fit method for specific circumstances. Firstly, we compare the Joined Q-ary Tree and the Probabilistic Cluster-Based Technique, and identify the performance and the properties of importance for both anti-collision methods in general. Secondly, we propose two new selective technique management strategies: 1) a Novel Decision Tree Strategy; and 2) a Six Thinking Hats Strategy (Pupunwiwat et al., 2011). We apply each strategy to the anti-collision method selection process in order to find the optimal method for specific scenarios. The benefit of choosing the correct anti-collision method is that the Chain Reaction impact on long-term RFID data management can be reduced. Finally, we form a new concept of the applicability of each type of anti-collision approach and apply it to a sample real-world scenario.

The remainder of this chapter comprises the definition of Chain Reaction and why it is important in RFID data management; the comparative analysis of the Joined Q-ary Tree versus the Probabilistic Cluster-Based Technique; the foundation of the two selective technique management strategies; and the sample applicability of each anti-collision approach to a real-world scenario.

6.1 Chain Reaction from Data Collection Process

A chain reaction is a sequence of reactions where a reactive product or by-product causes additional reactions to take place. In a chain reaction, positive feedback leads to a self-amplifying chain of events. For RFID data management, the step with the largest impact on the data is the RFID data

collection process. If any error occurs at the data collection level, its impact grows through all subsequent steps, such as data integration and aggregation; data query model and event processing; and data warehousing and data mining.

Our main goal in this chapter is to identify which anti-collision method is the most suitable for particular circumstances. To do so, we need a precise selective technique management that classifies which anti-collision scheme is the most suitable for a specific scenario. Once the anti-collision method selection process is optimised, the Chain Reaction from the data collection process toward long-term RFID data management can be reduced.

6.2 Comparative Analysis of Deterministic and Probabilistic Techniques

To compare and analyse our proposed anti-collision methods, we identified certain general properties of importance for anti-collision methods:

Tree-based methods
  Similarity of EPC pattern
  Number of tags within one group of the EPC pattern
  Overall number of tags within the interrogation zone

ALOHA-based methods
  Initial Frame-size (Q) specification
  Accuracy of Backlog prediction techniques
  Overall number of tags within the interrogation zone

In this study, we have empirically compared the performance of the Joined Q-ary Tree against the PCT anti-collision approach, because our deterministic and probabilistic methods have each outperformed existing techniques on their own grounds (Pupunwiwat and Stantic, 2010c,d). The Joined Q-ary Tree uses fewer resources, is simple to implement, and needs little reader power and memory, because it does not need to retain information in memory during identification. On the other hand, the Probabilistic Cluster-Based Technique works well in arbitrary situations, minimises the resources used, and increases system efficiency, without the need for a complex implementation.

We believe that this comparative analysis is necessary to identify the best overall method for specific circumstances.

Data sets

There are two major test cases in our empirical evaluation, and they were generated separately. The first test case considers specific EPC patterns (same product) with 50 and 100 tags per pallet. The second test case, used for the probabilistic approaches, has no specific EPC pattern (different products) and no specific number of tags per pallet. These two cases represent typical situations in a warehouse environment.

The three anti-collision schemes chosen for comparison are: 1) Joined Q-ary Tree, 2) PCT256 no group, and 3) PCT256. PCT256 no group does not apply any group-splitting rule but still employs our proposed PTES as its tag estimation method. Since it was verified previously that our methods perform better than existing approaches, it is necessary to know which of our methods is the most suitable under particular conditions and environments. The data sets are explained as follows:

Joined Q-ary Tree method

For the Joined Q-ary Tree anti-collision approach, there are ten pallets of inventory in test case A, and each pallet contains 100 cases/tags, giving a total of 1000 tags. Test case B also contains 1000 tags, but each pallet holds only 50 cases/tags. The GID-96 bits EPC encoding scheme is used with the EPC pattern from Table 6.1.

Test case A: Joined Q-ary Tree with 100 tags per pallet (Joined(100)) - 10 pallets, 100 cases each, total 1000 tags
Test case B: Joined Q-ary Tree with 50 tags per pallet (Joined(50)) - 20 pallets, 50 cases each, total 1000 tags

Table 6.1: Chosen EPC Pattern of Tree-based anti-collision methods for Comparative Analysis

EPC Pattern (GID-96)
H
GMN    104,426,055
OC     [9,872,273-9,872,292]
SN     [26,292,755,245-26,292,755,344]

PCT256 no group and PCT256 methods with PTES tag estimation

For both probabilistic anti-collision approaches, we considered different numbers of tags, from 100 to 1000. For each identification round, the optimal tunable parameters of PTES[CCE] are applied with different initial Q values. Table 6.2 lists the ALOHA-based anti-collision methods applied to the different numbers of tags and tunable initial Qs.
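To make the role of the EPC pattern concrete, the following sketch packs the GID-96 fields of Table 6.1 into 96-bit tag identifiers. The field widths (8-bit header, 28-bit General Manager Number, 24-bit Object Class, 36-bit Serial Number) and the header value 0011 0101 are taken from the EPC Tag Data Standard. The function names, and the mapping of one Object Class to one pallet with consecutive serial numbers, are assumptions for illustration only and are not the generator used for our data sets.

```python
GID96_HEADER = 0x35  # 0011 0101, the GID-96 header in the EPC Tag Data Standard


def encode_gid96(gmn: int, object_class: int, serial: int) -> str:
    """Return a GID-96 EPC as a 96-character bit string.

    Layout: header (8 bits) | GMN (28 bits) | OC (24 bits) | SN (36 bits).
    """
    if not 0 <= gmn < 2 ** 28:
        raise ValueError("General Manager Number must fit in 28 bits")
    if not 0 <= object_class < 2 ** 24:
        raise ValueError("Object Class must fit in 24 bits")
    if not 0 <= serial < 2 ** 36:
        raise ValueError("Serial Number must fit in 36 bits")
    value = (GID96_HEADER << 88) | (gmn << 60) | (object_class << 36) | serial
    return format(value, "096b")


def pallet_epcs(gmn: int, object_class: int, first_serial: int, count: int) -> list:
    """Hypothetical helper: one pallet = one Object Class, consecutive serials."""
    return [encode_gid96(gmn, object_class, first_serial + i) for i in range(count)]


if __name__ == "__main__":
    # First Object Class and first Serial Number of the ranges in Table 6.1.
    epcs = pallet_epcs(104_426_055, 9_872_273, 26_292_755_245, 100)
    print(epcs[0])
    print("shared prefix bits:", 8 + 28 + 24)
```

Under this assumed mapping, every tag on a pallet shares the first 60 bits (header, GMN, and OC), which is the kind of EPC similarity that tree-based methods depend on.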

All methods are applied separately to different randomly generated data sets, giving 20 test cases for each method within this test case.

Table 6.2: Chosen Parameters of ALOHA-based anti-collision methods for Comparative Analysis

Method             Initial Q               Backlog Prediction    No. Tags
PCT256 no group    8                       PTES[CCE]             100 to 1000
PCT256             tunable between 7, 8    PTES[CCE]             100 to

Comparative Analysis

In the empirical study, we investigated the performance of our proposed Joined Q-ary Tree and PCT256 (group and no group). Table 6.3 and Figure 6.1a) show that the number of slots of the Joined Q-ary Tree increases linearly with the number of tags within the interrogation zone. On the other hand, the number of slots of the PCT256 methods increases with no specific pattern. This is due to the random nature of slot generation in ALOHA-based approaches. Figure 6.1a) illustrates that the difference in performance between the methods grows with the number of tags, and becomes particularly visible at 1000 tags.

The overall slot counts show that the Joined Q-ary Tree with 100 tags per pallet (Joined(100)) obtained the smallest number of slots throughout the whole experiment, and therefore also required the shortest identification time. In contrast, the Joined Q-ary Tree with 50 tags per pallet (Joined(50)) performed poorly compared with Joined(100) and PCT256. These results show that the selection of the EPC pattern has a large impact on the performance of the Joined Q-ary Tree. When the chosen EPC pattern involves very small groups of tags (such as 50 tags per pallet), the performance of the Joined Q-ary Tree cannot be optimised.

Figure 6.1a) also demonstrates that PCT256 performed better than both PCT256 no group and Joined(50), but did not outperform Joined(100). However, PCT256 does not rely on any restriction of the EPC pattern and can be applied to any set of tags with a different encoding scheme. Moreover, PCT256 no group also performs better than Joined(50) when the number of tags is lower than 500, but begins to worsen as the number of tags gets higher.

Table 6.3 and Figure 6.1b) show the performance efficiency of all methods. It can be seen that Joined(100) achieved close to 47 percent efficiency once the number of tags reach. Additionally, the performance efficiency of both the Joined(100) and Joined(50) methods keeps increasing with the number of tags. In contrast, PCT256 cannot achieve a performance efficiency higher than 38 percent. By examining Figure 6.1b), it can be assumed that the efficiency of the Joined Q-ary Tree will increase slowly once the number of tags within the interrogation zone becomes very

high. For Joined(50), if the number of tags keeps increasing, it is possible that its performance efficiency will reach the same level as PCT256.

Figure 6.1: Comparative analysis of deterministic versus probabilistic anti-collision methods: a) Number of slots comparison and b) Performance efficiency

From the comparative analysis, we have identified certain properties of importance for anti-collision methods in general. For deterministic methods, we have discovered that performance is affected by the similarity of EPC patterns; the number of tags within one group of the EPC pattern; and the overall number of tags within the interrogation zone. For probabilistic methods, we have determined that the performance of the anti-collision technique depends on the initial frame-size (or Q value) specification; the accuracy of Backlog

prediction techniques; and the overall number of tags within the interrogation zone. We conclude that the Joined Q-ary Tree method can achieve higher efficiency if the right EPC pattern is configured. However, for arbitrary situations where no EPC pattern can be found, a probabilistic approach is preferable to the deterministic method.

Table 6.3: Number of slots and performance analysis for Joined Q-ary Tree (100 tags), Joined Q-ary Tree (50 tags), PCT256 no group, and PCT256 on different number of tags
Columns: Tags; Number of Slots (J100, J50, P256(N), P256); Efficiency (J100, J50, P256(N), P256)

6.3 Strategies for Choosing Suitable Anti-Collision Techniques

In this section, we clarify the importance of the data collection process and why the selection of an anti-collision method is a very important step for real-world applications. We introduce two novel strategies for choosing the correct type of anti-collision algorithm for the right situation.

Most past literature focuses only on improving a specific type of anti-collision technique, either deterministic or probabilistic, or attempts to combine both schemes. While this literature focuses heavily on improving the anti-collision method alone, no research has been done on how the data collection process can be optimised by employing the correct anti-collision method for the right business. In other words, we have been eagerly trying to improve something without knowing how these improvements can benefit real-life scenarios. Thus, we propose two novel strategies for optimal anti-collision method selection, which utilise the Decision Tree (Investopedia, 2011) and the Six Thinking Hats Strategy (Bono, 2000).

By selecting the correct anti-collision method for the business, the data collection procedure, which is the first and most important step in data management, can be optimised. Thus, the resource requirements, cost, and complexity of implementing the RFID system for data transformation, data security, and data organisation can be minimised. Also, by selecting the correct anti-collision algorithm for a specific scenario, we do not need the most complex and expensive algorithm to obtain the most efficient collection of data.


6.3.1 Novel Decision Tree for Anti-Collision Methods Selection

There are always many factors that contribute to the outcome of a solution. A Decision Tree can be used to clarify and answer a non-complex problem. Its structure allows users to take a problem with multiple possible solutions and display it in a simple format that shows the relationship between different events or decisions. According to the literature, a good Decision Tree can reach the same solution as complex Fuzzy Logic. For scenarios where few RFID locations and constraints are involved, it is wise to apply the Decision Tree to decide between deterministic and probabilistic anti-collision protocols.

Novel Decision Tree Architecture

In this study, we introduce the Novel Decision Tree Strategy for selective anti-collision technique management, where either the Joined Q-ary Tree, PCT no group, or PCT group is applicable. PCT no group does not split tags into groups, as the number of tags may not be high enough to require splitting. The properties of importance for anti-collision methods discovered in the comparative analysis are integrated into the decision-making process.

Figure 6.2: Novel Decision Tree for Anti-Collision Methods Selection
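The branches of the decision tree that are exercised by the sample scenarios of this section can be sketched in code. This is a minimal sketch and not a full transcription of Figure 6.2: the `Scenario` fields and the function name are hypothetical, the thresholds t and x default to the values used in the scenarios (100 and 300 tags), and any branch that the scenarios do not walk through returns `None` rather than a guessed outcome.

```python
from dataclasses import dataclass
from typing import Optional

JOINED = "Deterministic Joined Q-ary Tree"
PCT_NO_GROUP = "Probabilistic Cluster-Based Technique without grouping"
PCT = "Probabilistic Cluster-Based Technique"
BOTH = "Joined Q-ary Tree and Probabilistic Cluster-Based Technique"


@dataclass
class Scenario:
    large_enterprise: bool
    international: bool = False      # asked on the SME path
    mixed_sources: bool = False      # "Are all items from different sources?"
    involves_trading: bool = False   # asked on the large-enterprise path
    one_to_many: bool = False        # 1 to M supplier/consumer relationship
    tags_per_pallet: int = 0
    total_tags: int = 0


def select_method(s: Scenario, t: int = 100, x: int = 300) -> Optional[str]:
    """Follow only the branches shown in the worked scenarios."""
    if not s.large_enterprise:
        if s.international or s.mixed_sources:
            return None  # branch not covered by the scenarios
        if s.tags_per_pallet > t:
            return JOINED
        if s.total_tags > x:
            return None  # branch not covered by the scenarios
        return PCT_NO_GROUP
    if not s.involves_trading:
        return None  # branch not covered by the scenarios
    if not s.one_to_many:
        return PCT if s.total_tags > x else None
    if s.tags_per_pallet > t and s.total_tags > x:
        return BOTH
    return None  # branch not covered by the scenarios


if __name__ == "__main__":
    pen_maker = Scenario(large_enterprise=False, tags_per_pallet=150, total_tags=600)
    print(select_method(pen_maker))
```

Returning `None` for unvisited branches keeps the sketch honest about what the scenarios actually establish; a complete implementation would need the remaining branches from Figure 6.2.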

Figure 6.2 illustrates the steps of the decision-making process of the proposed Novel Decision Tree. Taking the properties found in our empirical study into consideration, we have constructed a decision tree that reflects the size of the company, the number of tags per pallet, the total number of tags, the EPC pattern, and the relationship between suppliers and consumers. For instance, a Local Pen Maker Company is a small company that produces all types of pens and exports them locally. The company packs a group of pens into a box, and then allocates the boxes to pallets. Each box contains x pens and each pallet contains y boxes. We can use this information to travel through each step of the decision tree and reach the final outcome.

Sample Scenarios using Decision Tree Selection

Local Pen Maker Company (SME)

A Local Pen Maker Company is a small company that produces all types of pens and exports them locally. The company packs a group of pens into a box, and then allocates the boxes to pallets. Each box contains 30 pens and each pallet contains 5 boxes (150 tags per pallet). Each pen is tagged with an individual passive RFID tag. The total number of tags within a single interrogation zone is 600 (4 pallets). Using the Decision Tree from Figure 6.2, the suitable anti-collision method for a Local Pen Maker Company is the Deterministic Joined Q-ary Tree. The threshold t for this scenario is 100 tags, and the procedure is as follows:

Question: Is this an SME or a large Enterprise?
Answer: SME
Question: Is this an international corporation?
Answer: No
Question: Are all items from different sources?
Answer: No
Question: Does the number of tags in a single pallet exceed 100?
Answer: Yes
Outcome: The suitable anti-collision method is a Deterministic Joined Q-ary Tree

Therefore, according to the decision tree outcome, a Local Pen Maker SME Company should employ the deterministic Joined Q-ary Tree as its anti-collision method (Figure 6.3).

Local Notebook Manufacturer (SME)

A Local Notebook Manufacturer is a medium-sized company that produces all types of notebooks, note pads, and writing pads, and exports them locally. The company packs a group of notebooks into a box, and then allocates the boxes to pallets. Each box contains 10 notebooks and each pallet contains 5 boxes (50 tags per pallet). Each notebook is tagged with an individual passive RFID tag. The total number of tags within a single interrogation zone is 250 (5 pallets).


Figure 6.3: Novel Decision Tree for Local Pen Maker Company (SME) Anti-Collision Methods Selection

Using the Decision Tree from Figure 6.2, the suitable anti-collision method for a Local Notebook Manufacturer is the Probabilistic Cluster-Based Technique with no grouping strategy. The threshold t for this scenario is 100 tags and x is 300 tags. The procedure is as follows:

Question: Is this an SME or a large Enterprise?
Answer: SME
Question: Is this an international corporation?
Answer: No
Question: Are all items from different sources?
Answer: No
Question: Does the number of tags in a single pallet exceed 100?
Answer: No
Question: Does the total number of tags in an interrogation zone exceed 300?
Answer: No
Outcome: The suitable anti-collision method is a Probabilistic Cluster-Based Technique without grouping strategy.

Thus, according to the decision tree outcome, a Local Notebook Manufacturer should employ PCT no group as its anti-collision method (Figure 6.4).

Figure 6.4: Novel Decision Tree for Local Notebook Manufacturer (SME) Anti-Collision Methods Selection

International Stationery Enterprise

An International Stationery Enterprise is a large business that imports all types of notebooks, pens, pencils, and pencil cases, and then


exports them internationally. The company packs a group of stationery items into a box, and then allocates the boxes to pallets. Each box contains different types of supplies and each pallet contains several boxes. Each stationery item is tagged with an individual passive RFID tag. The total number of tags within a single interrogation zone is 400. Using the Decision Tree from Figure 6.2, the suitable anti-collision method for an International Stationery Enterprise is the Probabilistic Cluster-Based Technique. The value of x for this scenario is 300 tags, and the procedure is as follows:

Question: Is this an SME or a large Enterprise?
Answer: Large Enterprise
Question: Does the corporation involve trading?
Answer: Yes
Question: Is the Supplier to Consumer or Consumer to Supplier relationship a 1 to M relationship?
Answer: No
Question: Does the total number of tags in an interrogation zone exceed 300?
Answer: Yes
Outcome: The suitable anti-collision method is a Probabilistic Cluster-Based Technique.

Therefore, according to the decision tree outcome, an International Stationery Enterprise should employ PCT as its anti-collision method (Figure 6.5).

Figure 6.5: Novel Decision Tree for International Stationery Enterprise Anti-Collision Methods Selection

International A-Grade Filing and Storage Group

An International A-Grade Filing and Storage Group is a large business that imports all types of filing and storage materials, and then exports them to a single local business. The company packs a group of materials into a box, and then allocates the boxes to pallets. Each box contains 20 items of a specific type of supply and each pallet contains 6 boxes (120 tags per pallet). Each product is tagged with an individual passive RFID tag. The total number of tags within a single interrogation zone is 350.

Using the Decision Tree from Figure 6.2, the suitable anti-collision method for an International A-Grade Filing and Storage Group is an integration of both the Deterministic

Joined Q-ary Tree and the Probabilistic Cluster-Based Technique. The threshold t for this scenario is 100 tags and x is 300 tags. The procedure is as follows:

Question: Is this an SME or a large Enterprise?
Answer: Large Enterprise
Question: Does the corporation involve trading?
Answer: Yes
Question: Is the Supplier to Consumer or Consumer to Supplier relationship a 1 to M relationship?
Answer: Yes
Question: Does the number of tags in a single pallet exceed 100, and is the total number of tags in an interrogation zone less than 300?
Answer: No
Question: Does the number of tags in a single pallet exceed 100, and does the total number of tags in an interrogation zone exceed 300?
Answer: Yes
Outcome: The suitable anti-collision methods are Deterministic Joined Q-ary Tree and Probabilistic Cluster-Based Technique.

Therefore, according to the decision tree outcome, an International A-Grade Filing and Storage Group should employ both the Joined Q-ary Tree and PCT as its anti-collision methods (Figure 6.6).

Figure 6.6: Novel Decision Tree for International A-Grade Filing and Storage Group Anti-Collision Methods Selection

Extended Solution for Complex Anti-Collision Methods Selection

This section introduces an alternative technique to be used instead of the Novel Decision Tree. The Novel Decision Tree may not be the best choice for some complex cases, where a more complex decision-making process, which involves more than facts and numbers, is required to obtain the best anti-collision selection. There are several

everyday decision-making techniques available today. However, we must select the technique that allows the selective decision to be made precisely and provides the best solution based on information, feelings, and experience.

Everyday decision making techniques

There are several existing everyday decision-making techniques that could be applied to the RFID anti-collision selection process. These techniques are described as follows:

Pros and Cons: This technique lists the advantages and disadvantages of each option.

Simple Prioritisation: This method selects the alternative with the highest probability-weighted utility.

Satisfaction: This technique accepts the first option that seems like it might achieve the desired result. The decision is made according to a person in authority or an expert.

Flipism: Flipism includes decision making based on flipping a coin, cutting a deck of playing cards, and other random or coincidental methods.

It is important to recognise the importance of anti-collision method selection. Thus, any existing everyday decision-making technique should be considered. However, each technique mentioned above involves only specific criteria in making the right decision. Therefore, we must provide an alternative method that combines all of these decision-making techniques, in order to derive the best solution in the anti-collision selection process. From past literature, we found the Six Thinking Hats strategy to be a useful decision-making technique that includes most decision-making processes and can be applied for effective selection of anti-collision methods in complex cases.

Six Thinking Hats Strategies

In this concept, there are six metaphorical hats, and the thinker can put on or take off one of these hats to indicate the type of thinking used. Bono (2000) stated that putting on and taking off these hats is essential. The hats must never be used to categorise individuals, even though their behaviour may seem to invite this (Bono, 2005, 2008). When done in groups, everybody wears the same hat at the same time. Figure 6.7 illustrates the Six Thinking Hats framework. The explanation of each hat is as follows:

Figure 6.7: Six Thinking Hats Framework

White Hat: The White Hat takes care of facts and numbers by thinking neutrally and objectively, and by focusing on the data and information that are available or needed.

Red Hat: This covers intuition, emotions, feelings, and hunches. The red hat allows the thinker to put forward an intuition without having to qualify or justify it.

Black Hat: This is the hat of judgment and caution, and is a most valuable hat. The black hat covers negative aspects, for example, why something cannot be done. The black hat must always be logical.

Yellow Hat: This is the logical positive, for example, why something will work and why it will offer benefits. It can be used when looking forward to the results of some proposed action, but can also be used to find something of value in what has already happened.

Green Hat: This is the hat of creativity, alternatives, proposals, what is interesting, provocations, and changes.

Blue Hat: This is the overview or process control hat. It looks not at the subject itself but at the thinking about the subject. The blue hat takes care of the control and organisation of the process of thought, and also of the use of the other hats.

To show the benefit of the Six Thinking Hats strategy over existing everyday decision-making methods, the following points relate each hat to an existing everyday decision-making technique:

Pros and Cons: Pros can be viewed through the Yellow hat, and Cons through the Black hat.

Simple Prioritisation: Simple Prioritisation can be examined through the White hat, which represents only facts and decisions made according to the verification of those facts.

Satisfaction: This technique can be observed through the White hat and the Red hat, as the decision is made according to facts from an expert and the likelihood that the first option might achieve the desired result.

Flipism: Flipism is based solely on luck and chance; thus, we classify this method under the Red hat category.

Six Thinking Hats for Complex Anti-Collision Methods Selection

It is crucial that the RFID system employs anti-collision protocols in its readers, in order to enhance the integrity of the captured data. However, choosing the right anti-collision protocol is also very important, since we cannot depend solely on the capability of the anti-collision protocol itself, but also on the suitability of each selected technique for the specific scenario. This is why the Novel Decision Tree alone is not sufficient: it is based solely on data figures. Therefore, we propose the Six Thinking Hats strategy for complex selective technique management to clarify the choice from the decision tree. The novelty of applying the Six Thinking Hats strategy to anti-collision selection is that we obtain an optimal and more precise outcome of anti-collision method selection for the specific scenario.

Preliminary

The one great enemy of thought is complexity, because it leads to confusion. When thought is clear and simple, it is more pleasing and effective. Therefore, the concept of the Six Thinking Hats is to think simply. First, the thinker must simplify the thought by treating one thing at a time, rather than everything at once. The thinker can then separate the emotions, the logic, the information, the hope, and the creativity. The intention of the Six Thinking Hats is to disassemble the thought, so that the thinker can think about one thing at a time, instead of doing everything at the same time.

For the anti-collision selective process, each thinker must make a decision separately. The Blue hat thinker, who controls the selective process, will make the final decision on the best anti-collision technique for the specific scenario.

The following shows how the six hats can be applied to the complex anti-collision selection process:

White hat - Facts & Information: A thinker wearing the White hat is neutral and objective. The thinker neither makes interpretations nor gives opinions, but instead imitates a computer that gives facts and numbers. The thinker must obtain information and fill the gaps in the existing information. In the anti-collision selection process, the White hat thinker relies solely on information given by various sources, such as the Novel Decision Tree, and decides which anti-collision method is most suitable for the current scenario.

Red hat - Feelings & Emotions: The Red hat makes feelings visible so that they can become part of the map, and also part of the system of values that chooses the route on the map. The Red hat gives the thinker a convenient way to enter and leave the emotional mode. Thus, it allows the thinker to explore feelings without ever attempting to justify them or to base them on logic. In the anti-collision selection process, the Red hat thinker makes a decision based on hunch and gut feeling. Thus, it is important that the person who wears the Red hat has a strong relationship with the enterprise.

Black hat - Being Cautious & Pessimistic: Black hat thinking deals specifically with negative judgment. The Black hat thinker indicates what is bad, incorrect, and erroneous. The thinker points out that something does not conform to experience or accepted knowledge, and why something is not going to work. In the anti-collision selection process, the Black hat thinker points out the disadvantages of the selected anti-collision approach, and why it may be necessary to change to a different technique.

Yellow hat - Being Positive & Optimistic: Yellow hat thinking is positive and constructive. The Yellow hat thinker deals with positive evaluation, in the same way that Black hat thinking deals with negative evaluation. The thinker investigates and explores in search of value and benefit, and then finds logical support for this value and benefit. In the anti-collision selection process, the Yellow hat thinker points out the advantages of the selected anti-collision technique, and why it is necessary to keep the current decision.

Green hat - New Ideas & Alternatives: The Green hat is for creative thinking. The search for alternatives is a fundamental aspect of Green hat thinking. In the anti-collision selection process, the Green hat thinker provides alternatives in cases where the deterministic and probabilistic algorithms carry the same weight of positive and negative impacts.

Blue hat - The Big Picture: The Blue hat thinker thinks about thinking, and sets the objective for each section. The Blue hat is the hat of control.

The Blue hat thinker must organise the thinking itself, and consider what thinking is needed to investigate the subject. The Blue hat thinker is like a conductor, who calls for and makes use of the other hats. In the anti-collision selection process, the Blue hat thinker decides who puts on each hat and what the main scope of the overall selection process is.

Global Trading Enterprise (GTE) Scenario

Global Trading Enterprise (GTE) is a large international business with a Many-to-Many relationship between suppliers and consumers. GTE imports products from different countries, then repackages and exports them internationally and locally to different companies. The company holds a large amount of inventory, which is stocked in a special warehouse with four different zones, as shown in Figure 6.8.

Figure 6.8: Six Thinking Hats: Global Trading Enterprise (GTE) Scenario

Based on the information given in Figure 6.8, Table 6.4 displays the preferred algorithm for each location, that is, the one that will provide the optimal quality of collected data in the GTE scenario.

Table 6.4: Preferred Anti-Collision Method for Each Location (Zone 1-4) in GTE scenario
Location    Joined Q-ary Tree    PCT Group    PCT no Group
Zone One    X    X
Zone Two    X    X
Zone Three    X    X
Zone Four    X    X

Decision Making Phase

When the Novel Decision Tree Strategy and the Six Thinking Hats Strategy were applied to the GTE scenario, they led to different conclusions about which anti-collision methods to deploy. The steps of the decision-making processes are as follows:

Question: Is this an SME or a large Enterprise?
Answer: Large Enterprise.
Question: Does the corporation involve trading?
Answer: Yes, GTE is a global trading company.
Question: Is the Supplier to Consumer or Consumer to Supplier relationship a 1 to M relationship?
Answer: No, GTE has more than one supplier and consumer all over the world.
Question: Is the total number of tags in an interrogation zone exceeding x tags?
Answer: Yes, GTE's warehouse stores large numbers of goods and uses an RFID system to monitor and control inventories.
Outcome: The suitable anti-collision method is a Probabilistic Cluster-Based Technique.

According to the decision tree outcome, GTE should employ a PCT group as its anti-collision method for all locations.

Six Thinking Hats

White Hat: For the GTE scenario, the thinker who wears the White hat goes for realistic data and stays with the Novel Decision Tree assessment, which is to select the PCT deployment for all four zones.

Red Hat: The Red hat is put on by local warehouse staff, who know the environment better than the board of directors. The Red hat wearer has decided that different anti-collision techniques should be deployed for the different zones.

Yellow Hat: In this scenario, the thinker who wears the Yellow hat points out the advantage of the selected anti-collision technique and why it is necessary to keep the current decision. The thinker has decided on deploying only PCT, since it is simple to order one lot of hardware and software from the same supplier, and this avoids unnecessary procedures and time frames for implementation.

Black Hat: Logically, at the unloading zone (zone one), trucks usually arrive from the same company/supplier. In addition, at zone three, where tagged items are moved along the conveyor belt, realistically it is impossible to have more than one hundred cases of alcohol sitting on the belt. Thus, the Black hat thinker has decided that a different anti-collision algorithm must be deployed at both zones one and three.

Green Hat: The Green hat wearer agrees with the Black and Red hat wearers, since the Green hat takes care of the old ideas and presents alternatives. However, because the options are strictly limited to either deterministic or probabilistic for each zone, the Green hat decides on applying the Joined Q-ary Tree to zone one instead of PCT, and also suggests PCT no group for zone three, as not many tags will be present on the conveyor belt.

Blue Hat: The Blue hat thinker thinks about thinking and sets objectives for each section. In the anti-collision selection process, the Blue hat thinker decides who puts on each hat and what the main scope of the overall selection process is. From the overall analysis, the Blue hat has decided to employ both types of anti-collision method and to apply them to different zones.

According to the Six Thinking Hats Strategy, GTE should employ a PCT group at zone two and zone four only, since these two zones involve arbitrary goods. The Six Thinking Hats strategy recommends that the Joined Q-ary Tree is deployed instead of the PCT group at zone one, because arriving items are usually delivered from the same supplier. At zone three, it is recommended that the PCT no group is implemented, since this location involves arbitrary goods but does not involve a large number of tags.

Solution Phase

For a complex scenario such as GTE, complex kinds of thinking are needed in order to obtain the optimal result from each anti-collision algorithm. The Six Thinking Hats can correctly identify the best algorithms for all four zones, as shown in Table 6.5. The Novel Decision Tree, however, can only obtain the correct algorithms for zones two and four. This is because the Novel Decision Tree only takes the facts and figures into consideration, without any concern for unforeseen special circumstances or for specific environmental requirements. Thus, for zone one and zone three, where the information provided is ambiguous, the Novel Decision Tree cannot correctly identify the suitable algorithm.
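The per-zone outcomes of the two strategies described above can be compared mechanically. The following is a minimal sketch, assuming the recommendations are simply recorded by hand as dictionaries; the variable and function names are hypothetical and are not part of the proposed strategies themselves.

```python
# Per-zone recommendations for the GTE scenario, transcribed from the text.
# All names here (JQT, PCT_G, agreeing_zones, ...) are illustrative only.
JQT = "Joined Q-ary Tree"
PCT_G = "PCT Group"
PCT_NG = "PCT no Group"

ZONES = ["Zone One", "Zone Two", "Zone Three", "Zone Four"]

# Novel Decision Tree outcome: PCT group for all locations.
decision_tree = {zone: PCT_G for zone in ZONES}

# Six Thinking Hats outcome, as decided by the Blue hat.
six_hats = {
    "Zone One": JQT,      # items usually arrive from the same supplier
    "Zone Two": PCT_G,    # arbitrary goods
    "Zone Three": PCT_NG, # arbitrary goods, but few tags on the conveyor belt
    "Zone Four": PCT_G,   # arbitrary goods
}


def agreeing_zones(first, second):
    """Return the zones for which both strategies select the same method."""
    return [zone for zone in ZONES if first[zone] == second[zone]]


agree = agreeing_zones(decision_tree, six_hats)
differ = [zone for zone in ZONES if zone not in agree]
print("Same method:", agree)
print("Different method:", differ)
```

Under these inputs the two strategies agree only on zones two and four, which matches the observation that the Novel Decision Tree obtains the correct algorithm for those two zones only.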

Table 6.5: Selected Anti-Collision Method using Decision Tree and Six Thinking Hats Strategies. Joined Q-ary Tree = JQT; PCT Group = PCT-G; PCT no Group = PCT-NG
Location    Novel Decision Tree    Six Thinking Hats
Zone One    PCT-G    JQT
Zone Two    PCT-G    PCT-G
Zone Three    PCT-G    PCT-NG
Zone Four    PCT-G    PCT-G

From the investigation, we have discovered that each anti-collision method has an advantage over the others in some cases. We found that by correctly identifying the most suitable anti-collision technique, using our proposed Novel Decision Tree Strategy and Six Thinking Hats Strategy, the data collection process can be improved, and the chain reaction toward the next level of data transformation, aggregation, and event processing can be reduced. Thus, it is important that the correct type of anti-collision algorithm is applied to each scenario. The next two sections explain the applicability of both the Joined Q-ary Tree and the Probabilistic Cluster-Based Technique, in accordance with sample real world scenarios.

6.4 Applicability of Anti-Collision Techniques in Real World Scenario

This section demonstrates sample scenarios in which our proposed Joined Q-ary Tree and Probabilistic Cluster-Based Technique can be applied. The first scenario is a Wine Warehouse Tag-and-Ship Scenario, where a deterministic Joined Q-ary Tree can be utilised as an adequate anti-collision scheme. The Joined Q-ary Tree is deployed within the RFID reader device and communicates with passive RFID tags present within the interrogation zone. The second scenario is a Document Warehouse Scenario, where a Probabilistic Cluster-Based Technique is exploited as an anti-collision approach. Both scenarios are explained in detail in the following subsections.

Wine Warehouse Tag-and-Ship Scenario

The Wine Warehouse Tag-and-Ship Scenario is one where a business has decided to select its own encoding scheme, reader type, tag type, EPC pattern, and middleware vendor. The warehouse holds a huge amount of inventory, with many pallets of goods travelling through the supply chain. The tags must be printed according to the EPC pattern selected within the organisation. Then, after all tagged items are deployed around the warehouse, they can be tracked and traced easily. The benefit of utilising an RFID system in this scenario is that inventories within the warehouse can be tracked automatically in real time, which minimises the cost of manual labour and maximises the visibility of items. In the Wine Warehouse Tag-and-Ship Scenario, the Joined Q-ary Tree is employed as the anti-collision method, because the scenario involves RFID data from the same source and has massive tag movement.

Figure 6.9: Wine Warehouse Tag-and-Ship Scenario

Figure 6.9 illustrates a sample Wine Warehouse Tag-and-Ship Scenario and the applicability of the Joined Q-ary Tree as an anti-collision approach. The figure shows a sample set-up including the environment area, items, readers, tags, middleware, and the operation process. The details of Figure 6.9 are explained as follows:

Complex Event Processing - Complex Event Processing (CEP) deals with the many events that happen across all layers of an RFID organisation. It identifies the most meaningful events within a massive number of events, analyses their impact, and takes subsequent action in real time. Complex Event Processing refers to the processing of states, changes of state, or time. An event may be observed as a change of state of any physical, logical, or other condition in an RFID system.

Device Service Provider Interface (DSPI) - A device service provider interface (DSPI) component provides a uniform way to communicate with and manage an RFID device. The DSPI component can include a receiver component that receives one or more items of RFID server data and RFID device data. The interface can be defined to handle discovery, configuration, communication, and connection management.
RFID Printer - A printing device used to write data to an RFID tag, which can also print graphics, barcodes, and text onto the label.
RFID Reader - A transmitter or receiver that reads the contents of RFID tags in its proximity. The maximum distance between the reader's antenna and the tag varies, depending on the type of reader and tag, and on the RFID application.
Middleware - RFID middleware is placed between the reader and the enterprise applications and database systems. According to EPC specifications, middleware applications handle the tag and reader data originating from different sources. RFID middleware supports the unique EPC data of the items that are being tagged, and aggregates massive amounts of data before it reaches the enterprise applications. Data is routed and converted into formats as required by the various applications.
Database - A database is a collection of information that is organised and stored in a computer system so that it can easily be accessed, managed, and updated. In one view, databases can be classified according to types of content: bibliographic, full-text, numeric, and images.
Application - An application is computer software designed to perform a specific function, or one or more related specific tasks, directly for the end user. Examples of application programs include database programs, development tools, and communication programs. Application programs use the services of the computer's operating system and other supporting programs.
Untagged Item - An item that has newly arrived and has not yet been tagged.
Tagged Item - An item that has been tagged by an RFID printer.

The detailed process of the Wine Warehouse Tag-and-Ship Scenario is as follows:

After the cases have been picked and are ready to be tagged, an operator uses a Tag Printing application and an RFID Printer to print and apply tags onto each case of wine.
The Tag Printing application stores the information about the tags that it prints in the CasesTagged database.
When the user clicks Print Tag in the Tag Printing application, the Print Tag command is processed in the following way:
The Print Tag command, with the tag information, is passed to the event processing engine.

The event processing engine sends the Print Tag command to the Printer Device's provider, and the Print Tag command is queued in the Command Processing Thread of the provider.
The Command Processing Thread converts the tag information from the DSPI standard format to the proprietary tag format of the Printer Device's provider.
The tag information, including the EPC number, is transferred to the RFID Printer using the Printer Device Host protocol.

The RFID Printer prints the tag, and the tag is applied to the case of wine, either manually or by using an automated system.
The Tag Printing application puts the information about the tag that was printed in the CasesTagged database.
The tagged cases move along a conveyor belt toward the exit, where an RFID Reader is located. The Reader reads the tag on each case and sends all tag information through the middleware. The middleware then stores each received tag in the CasesShipped database. The information in this database is used for business analyses, such as comparing information in the CasesTagged database and the CasesShipped database. This ensures that all cases of wine being shipped to the customer have been tagged.
The tag information is sent to the reader's provider and is processed in the following way:
The provider has a translating mechanism that converts the tag information from the proprietary format of the Reader Device's provider into the DSPI standard format.
During the data capturing process, the Reader Device performs the Joined Q-ary Tree anti-collision protocol to reduce tag transmission collisions, and also performs other filtering algorithms to eliminate data errors.
The tag-read event travels to the shipping process in RFID services, where the event handler is pre-coded to send the tag information to the CasesShipped database.
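The comparison of the CasesTagged and CasesShipped databases mentioned above amounts to a set difference over EPC values. The following is a minimal sketch, assuming the contents of each database can be read out as a plain list of EPC strings; the function name, the result keys, and the sample EPC values are hypothetical and serve only as an illustration.

```python
def reconcile(cases_tagged, cases_shipped):
    """Compare EPCs stored at tagging time with EPCs read at the exit.

    Both arguments are iterables of EPC strings. Duplicate reads of the
    same tag are collapsed, since a reader may report a tag more than once.
    """
    tagged = set(cases_tagged)
    shipped = set(cases_shipped)
    return {
        # Tagged cases that were never read at the exit reader.
        "tagged_not_shipped": sorted(tagged - shipped),
        # Cases read at the exit with no record in CasesTagged.
        "shipped_not_tagged": sorted(shipped - tagged),
    }


# Hypothetical sample data, not real EPC values.
tagged_epcs = ["EPC-0001", "EPC-0002", "EPC-0003"]
shipped_epcs = ["EPC-0001", "EPC-0003", "EPC-0003", "EPC-0009"]

report = reconcile(tagged_epcs, shipped_epcs)
print(report)
```

In this sample, one tagged case has not reached the exit and one shipped case has no tagging record; an empty result for both keys would indicate that the two databases are consistent.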
The CasesShipped database and the CasesTagged database can be compared to check whether all wine cases that entered the warehouse have reached the exit and have been forwarded for shipping.

Document Warehouse Scenario

The Document Warehouse Scenario shows an example of an enterprise that receives items from various vendors. The document warehouse holds a huge amount of inventory, where the data can come from various locations and each tag requires a specific encoding scheme, but the enterprise does not produce its own printed tags. The benefit of utilising an RFID system in this scenario is that inventories within the warehouse can be tracked automatically in real time, which minimises the cost of manual labour and maximises the visibility of items. In the Document Warehouse Scenario, the Probabilistic Cluster-Based Technique is employed as the anti-collision method, since the scenario involves RFID data from different sources and has massive tag movement.

Figure 6.10: Document Warehouse Scenario

Figure 6.10 illustrates a sample Document Warehouse Scenario and the applicability of the Probabilistic Cluster-Based Technique as an anti-collision method. The figure shows a sample set-up, including the environment area, items, readers, tags, middleware, and the operation process. The definitions of Complex Event Processing, DSPI, RFID Reader, Middleware, and Database were given in the previous subsection. The detailed process of the Document Warehouse Scenario is as follows:

The Document Warehouse Scenario does not involve a printing process as in the Wine Warehouse Tag-and-Ship Scenario. Several inventories, which have already been tagged,

A RFID Explicit Tag Estimation Scheme for Dynamic Framed- Slot ALOHA Anti-Collision Author Pupunwiwat, Prapassara, Stantic, Bela Published 2010 Conference Title 6th International Conference on Wireless