Tag cloud generation for results of multiple keywords queries
1 Tag cloud generation for results of multiple keywords queries Martin Leginus, Peter Dolog and Ricardo Gomes Lage IWIS, Department of Computer Science, Aalborg University
2 What are tag clouds? A tag cloud is a visual retrieval interface depicting the most important terms of a dataset. Tag clouds are built either on top of the entire dataset or on top of query results (query based tag clouds).
3 What are tag clouds? Tag clouds built on top of the entire dataset.
4 What are tag clouds? Query based tag clouds.
5 Motivation The work is motivated by personalization tasks, surveillance systems, and information retrieval tasks defined with multiple keywords.
6 Techniques
Most Frequent Tags from Corpus (MFTC) and Most Frequent Tags from Query Result Set (POP): the most frequent topics within the system are propagated to the tag cloud. The tag cloud does not cover less frequently represented topics that could be relevant for the user.
Term frequency inverse document frequency selection (TFIDF): for each tag from the documents associated with the query keywords, tf-idf is computed. These values are aggregated and sorted in descending order. Semantic similarities between tags are not considered.
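The TFIDF selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `tfidf_tag_cloud`, the toy corpus, the rule that a document matches only if it carries every query tag, and the choice to sum tf-idf values per tag are all assumptions.

```python
import math
from collections import Counter

def tfidf_tag_cloud(corpus, query_tags, k):
    """Rank the tags of documents matching the query by aggregated tf-idf.

    corpus maps a document id to the list of tags assigned to it
    (a tag may repeat when several users assigned it).
    """
    n_docs = len(corpus)
    # Document frequency: number of documents in which each tag appears.
    df = Counter(tag for tags in corpus.values() for tag in set(tags))
    query = set(query_tags)
    # Assumption: a document is retrieved only if it carries every query tag.
    results = [tags for tags in corpus.values() if query <= set(tags)]
    scores = Counter()
    for tags in results:
        for tag, count in Counter(tags).items():
            if tag in query:
                continue  # assumption: query tags are not repeated in the cloud
            tf = count / len(tags)
            idf = math.log(n_docs / df[tag])
            scores[tag] += tf * idf  # aggregate over all retrieved documents
    # Sort by descending score; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]

# Hypothetical toy corpus, for illustration only.
corpus = {
    "d1": ["python", "web", "django", "django"],
    "d2": ["python", "web", "flask"],
    "d3": ["python", "data"],
    "d4": ["java", "web"],
}
cloud = tfidf_tag_cloud(corpus, ["python", "web"], k=2)
print(cloud)
```

Because each tag is scored on its own, two near-synonymous tags can both enter the cloud, which is the limitation the slide points out.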
7 Techniques Max coverage selection (COV): maximization of coverage and minimization of overlap between the tags of the tag cloud. Optimizing coverage might generate tag clouds whose terms have high coverage but are irrelevant for the specific user's information retrieval goal.
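The slide does not say how coverage is optimized. One common heuristic for this kind of problem is greedy selection, sketched below under that assumption; the function name `greedy_coverage_tag_cloud` and the toy data are hypothetical.

```python
def greedy_coverage_tag_cloud(tag_docs, k):
    """Greedily pick up to k tags, each adding the most not-yet-covered documents.

    tag_docs maps a tag to the set of documents it is assigned to.
    """
    covered = set()
    selected = []
    candidates = dict(tag_docs)
    while candidates and len(selected) < k:
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(candidates), key=lambda t: len(candidates[t] - covered))
        if not candidates[best] - covered:
            break  # no remaining tag adds new documents
        selected.append(best)
        covered |= candidates.pop(best)
    return selected, covered

# Hypothetical toy data: tag -> ids of documents carrying it.
tag_docs = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 2}, "d": {5}}
selected, covered = greedy_coverage_tag_cloud(tag_docs, k=2)
print(selected, covered)
```

Tag "c" is never chosen here because its documents are already covered by "a", which illustrates the overlap minimization; nothing in the procedure checks whether the chosen tags are relevant to a query.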
8 Graph based techniques
1. The tag space is transformed into a graph.
   1. Calculate the tag pair co-occurrence using Jaccard similarity for all tags.
   2. When the similarity of a tag pair is greater than a predefined threshold α, we consider the two tags similar.
   3. Each similar tag pair is transformed into two directed edges, $t_1 \to t_2$ and $t_2 \to t_1$.
2. Graph based methods for relevance estimation
   The algorithms rank the importance of a tag $t$ with respect to the query keywords $T_q$: $I(t \mid T_q)$.
   The top k most relevant tags are selected for the final tag cloud.
9 Graph based techniques
1. The tag space is transformed into a graph. Calculate the tag pair co-occurrence using Jaccard similarity for all tags.
Samuel L. Jackson is assigned to Goodfellas (1990), Pulp Fiction (1994), Die Hard: With a Vengeance (1995), Kill Bill: Vol. 2 (2004).
Tarantino is assigned to Reservoir Dogs (1992), Pulp Fiction (1994), Four Rooms (1995), Jackie Brown (1997), Kill Bill: Vol. 1 (2003), Kill Bill: Vol. 2 (2004).
The tags co-occur at Pulp Fiction and Kill Bill: Vol. 2, so $JAC(\text{Samuel L. Jackson}, \text{Tarantino}) = 2 / 8 = 0.25$ (2 shared movies out of 8 distinct movies).
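The Jaccard computation for this example can be checked with a few lines of Python. The helper name `jaccard` is illustrative; the movie lists are the ones given on the slide.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

jackson = {
    "Goodfellas (1990)",
    "Pulp Fiction (1994)",
    "Die Hard: With a Vengeance (1995)",
    "Kill Bill: Vol. 2 (2004)",
}
tarantino = {
    "Reservoir Dogs (1992)",
    "Pulp Fiction (1994)",
    "Four Rooms (1995)",
    "Jackie Brown (1997)",
    "Kill Bill: Vol. 1 (2003)",
    "Kill Bill: Vol. 2 (2004)",
}
similarity = jaccard(jackson, tarantino)
print(similarity)  # 2 shared movies / 8 distinct movies
```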
10 Graph based techniques
1. The tag space is transformed into a graph.
   1. Calculate the tag pair co-occurrence using Jaccard similarity for all tags.
   2. When the similarity of a tag pair is greater than a predefined threshold α, we consider the two tags similar.
   3. Each similar tag pair is transformed into two directed edges, $t_1 \to t_2$ and $t_2 \to t_1$.
2. Graph based methods for relevance estimation
   The algorithms rank the importance of a tag $t$ with respect to the query keywords $T_q$: $I(t \mid T_q)$.
   The top k most relevant tags are selected for the final tag cloud.
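The graph construction in step 1 can be sketched as below. This is an illustration under assumptions: the names `build_tag_graph` and `alpha`, the adjacency-set representation, and the extra tag "musical" with its movie are hypothetical; only the Jackson and Tarantino assignments come from the slides.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_tag_graph(tag_docs, alpha):
    """Return tag -> set of successor tags.

    Every pair whose Jaccard similarity exceeds alpha is linked by
    two directed edges, one in each direction.
    """
    graph = {tag: set() for tag in tag_docs}
    for t1, t2 in combinations(sorted(tag_docs), 2):
        if jaccard(tag_docs[t1], tag_docs[t2]) > alpha:
            graph[t1].add(t2)  # edge t1 -> t2
            graph[t2].add(t1)  # edge t2 -> t1
    return graph

tag_docs = {
    "Samuel L. Jackson": {"Goodfellas", "Pulp Fiction",
                          "Die Hard: With a Vengeance", "Kill Bill: Vol. 2"},
    "Tarantino": {"Reservoir Dogs", "Pulp Fiction", "Four Rooms",
                  "Jackie Brown", "Kill Bill: Vol. 1", "Kill Bill: Vol. 2"},
    "musical": {"Grease"},  # hypothetical tag with no shared movies
}
graph = build_tag_graph(tag_docs, alpha=0.2)
print(graph)
```

Comparing all tag pairs is quadratic in the number of tags, so a real dataset would likely need an inverted index from documents to tags to skip pairs that never co-occur.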
11 Graph based techniques Graph based methods for relevance estimation: distance based approaches are computationally expensive; stochastic approaches simulate a random traversal of the graph. In this work, we focus only on stochastic approaches.
12 Stochastic graph based techniques The importance of nodes in the graph is measured through the simulation of a stochastic process, i.e., a random traversal of the graph. The transition probability from a node $u$ to each node $v$ that has an ingoing edge from $u$ is defined as $p(v \mid u) = 1 / d_{out}(u)$, where $d_{out}(u)$ is the number of outgoing edges of $u$.
13 Stochastic graph based techniques [Figure: example tag graph with the nodes Bruce Willis, Reservoir Dogs, Unbreakable, Samuel L. Jackson, Quentin Tarantino, Pulp Fiction, Kill Bill: Vol. 2]
14 Stochastic graph based techniques The random walk starts from the Pulp Fiction node, which offers 5 possible transitions. [Figure: example tag graph]
15 Stochastic graph based techniques The walk jumped to the Bruce Willis tag, which offers only three possible transitions. [Figure: example tag graph]
16 Stochastic graph based techniques The walk jumped to the Unbreakable tag, which offers only three possible transitions. [Figure: example tag graph]
17 Stochastic graph based techniques After some time the random walk converges: if it runs longer, the share of time the token spends at a given node stays the same. [Figure: example tag graph with an additional node XY]
18 Stochastic graph based techniques At each step of the random walk, it is possible to perform a random restart, which starts the random walk again from one of the root nodes (the query tags). [Figure: example tag graph with an additional node XY]
19 PageRank with priors Relative importance to the query tags is introduced through a vector of prior probabilities. The random surfer is assigned a back probability, the probability of jumping back to the query tags at each step. The resulting ranks, biased towards the query tags $T_q$, are taken as the definition of importance after convergence, i.e., $I(t \mid T_q)$. The method requires several parameters, such as the back probability and the prior probabilities, to be set with respect to a specific dataset.
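A minimal sketch of PageRank with priors follows, assuming the usual power-iteration form: at each step a node passes a `1 - back_prob` share of its rank evenly along its outgoing edges, and the remaining `back_prob` share is redistributed according to the priors. The names `pagerank_with_priors`, `back_prob`, the fixed iteration count, the handling of nodes without outgoing edges, and the toy graph are assumptions, not details taken from the slides.

```python
def pagerank_with_priors(graph, priors, back_prob, iterations=100):
    """Power iteration for PageRank biased towards the nodes listed in priors.

    graph maps a node to the set of its successors;
    priors maps query nodes to probabilities that sum to 1.
    """
    rank = {node: priors.get(node, 0.0) for node in graph}
    for _ in range(iterations):
        # Assumption: rank held by nodes without outgoing edges
        # also returns to the query nodes.
        dangling = sum(rank[u] for u in graph if not graph[u])
        restart = back_prob + (1 - back_prob) * dangling
        new_rank = {node: restart * priors.get(node, 0.0) for node in graph}
        for u, successors in graph.items():
            if not successors:
                continue
            # Uniform transition probability over outgoing edges.
            share = (1 - back_prob) * rank[u] / len(successors)
            for v in successors:
                new_rank[v] += share
        rank = new_rank
    return rank

# Hypothetical toy graph; "b" plays the role of the only query tag.
graph = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "d"}, "d": {"c"}}
ranks = pagerank_with_priors(graph, priors={"b": 1.0}, back_prob=0.3)
top = sorted(ranks, key=ranks.get, reverse=True)
print(top)
```

A larger back probability keeps the walk closer to the query tags, so the ranking becomes more query specific and less influenced by globally well-connected tags.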
20 HITS with priors The same prior probabilities and a back probability are used.
21 K step Markov chain This method differs in the implementation of the random surfer model: it is implemented with a path length limitation K, which determines how often we jump back to the root nodes. Relative importance to the root nodes is introduced through a vector of prior probabilities.
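One way to read the K step Markov chain is sketched below: the walk starts at the root nodes according to the priors, takes at most K steps, and a node's importance is the sum of the probabilities of being visited after 1, 2, ..., K steps. This formulation, the name `k_step_markov`, the uniform transition probabilities, and the toy graph are assumptions; the slides do not give the formula.

```python
def k_step_markov(graph, priors, k):
    """Sum of visit probabilities over walks of length 1..k started from the priors.

    graph maps a node to the set of its successors;
    priors maps root (query) nodes to starting probabilities.
    """
    current = {node: priors.get(node, 0.0) for node in graph}
    importance = {node: 0.0 for node in graph}
    for _ in range(k):
        nxt = {node: 0.0 for node in graph}
        for u, successors in graph.items():
            if not successors:
                continue
            share = current[u] / len(successors)  # uniform transitions
            for v in successors:
                nxt[v] += share
        for node in graph:
            importance[node] += nxt[node]
        current = nxt
    return importance

# Hypothetical toy graph; "b" is the only root node.
graph = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "d"}, "d": {"c"}}
importance = k_step_markov(graph, priors={"b": 1.0}, k=2)
print(importance)
```

With K = 2 the node "d" is three steps away from the root and receives no importance at all, which shows how a small K restricts the tag cloud to the close neighbourhood of the query tags.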
22 Prior probabilities A uniform prior distribution results in the inclusion of irrelevant tags in the final tag cloud. Instead, the priors reflect the relative popularity of the query tags.
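The slide does not give the exact formula for popularity based priors. A plausible reading, sketched here as an assumption, makes each query tag's prior proportional to how often the tag was used; the name `popularity_priors` and the counts are hypothetical.

```python
def popularity_priors(query_tags, tag_counts):
    """Prior of each query tag, proportional to its usage count."""
    total = sum(tag_counts.get(tag, 0) for tag in query_tags)
    if total == 0:
        # Fall back to a uniform distribution when no counts are known.
        return {tag: 1.0 / len(query_tags) for tag in query_tags}
    return {tag: tag_counts.get(tag, 0) / total for tag in query_tags}

# Hypothetical usage counts of two query tags.
tag_counts = {"python": 300, "pandas": 100}
priors = popularity_priors(["python", "pandas"], tag_counts)
print(priors)
```

Under this reading a rarely used query tag receives a small prior, so the random walk restarts from it less often than from the popular query tags.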
23
24 Datasets Bibsonomy contains 206k items, 51k tags and 466k tagging posts. Movielens contains 16k tags, 7k movies and 95k tagging posts. Delicious contains 187k tags, 355k bookmarks and 2046k tagging posts.
25 Synthetic metrics Synthetic metrics express the quality of the tag selection process (Venetis 2011). Relevance of a tag cloud: expresses how relevant the tags in the tag cloud are to the query tags. We compute the average relevance of all tags in the tag cloud. The metric captures the extent to which the resources associated with the tag cloud tags overlap with the resources retrieved by the query tags. We do not consider coverage, as this metric might be misleading: tags with high coverage can be irrelevant and not discriminative enough for retrieval tasks.
26 [Figure: a set of tags $T_q$ issued as a query; the documents retrieved by the keyword query; the documents associated with tag $T_1$; the documents associated with tag $T_2$.] Tag $T_1$ is more relevant than $T_2$: $T_1$ can be perceived as a more specific subtopic of the documents returned by the query $T_q$ and is more discriminative for filtering purposes.
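The relevance metric can be sketched as follows. The per-tag formula is an assumption consistent with the description above: the share of a tag's documents that also appear in the query result, averaged over the tag cloud. The name `relevance` and the toy data are hypothetical.

```python
def relevance(tag_cloud, tag_docs, query_docs):
    """Average share of each tag's documents that the query also retrieved."""
    scores = []
    for tag in tag_cloud:
        docs = tag_docs[tag]
        scores.append(len(docs & query_docs) / len(docs) if docs else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical data: ids of documents retrieved by the query and per tag.
query_docs = {1, 2, 3, 4}
tag_docs = {
    "specific": {1, 2},      # lies entirely inside the query result
    "broad": {3, 5, 6, 7},   # mostly outside the query result
}
specific_only = relevance(["specific"], tag_docs, query_docs)
both = relevance(["specific", "broad"], tag_docs, query_docs)
print(specific_only, both)
```

The tag whose documents fall entirely inside the query result scores 1.0, matching the idea that a more specific subtopic of the query result is more relevant than a broad tag.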
27 Results Bibsonomy
28 Results Movielens
29 Results Delicious
30 Limitations The methods do not perform well on datasets with a long tail distribution of tags. This is caused by the way the tag space is transformed into a graph structure.
31 Conclusions The graph based methods perform best on the Movielens and Bibsonomy datasets. We proposed an extension of the setting of prior probabilities for the random walk based algorithms. The methods do not perform well on the Delicious dataset.
32 Future work Propose an enhanced graph creation. Enhance tag selection to generate more diverse and novel tag clouds. Extend the synthetic metrics to better capture the diversity and novelty of tag clouds.
33 Questions
34
35 Possible questions: Why is there a need to adjust the prior probabilities? When a rarely used tag is chosen as a query tag, the tag does not co-occur with many other tags. Therefore, few edges connect its graph node with other nodes. A random traversal of the graph initiated from the rarely used tag (node) might reach unimportant or irrelevant nodes (tags). Consequently, irrelevant tags are included in the tag cloud. We verified this assumption in a series of preliminary evaluations.
36 Possible questions: Why is Delicious different? There are many very frequent tags in the dataset, i.e., almost 20 tags that were assigned at least [...] times and almost 500 tags that were assigned by users at least 1000 times. On the other hand, there are tags used fewer than 10 times. The underlying co-occurrence graph links very frequent tags with very rarely used tags. As a result, more frequent tags are included in the tag clouds, which lowers relevance.
Proc. vol.6,no2,2013,pp.22-30 Schl. ITE Tokai Univ. Vol. xx,no.xx,20xx,pp.xxx -xxx Paper Paper Finding Similar Tweets and Similar Users by Applying Document Similarity to Twitter Streaming Data by Iwao
More informationPredicting ratings of peer-generated content with personalized metrics
Predicting ratings of peer-generated content with personalized metrics Project report Tyler Casey tyler.casey09@gmail.com Marius Lazer mlazer@stanford.edu [Group #40] Ashish Mathew amathew9@stanford.edu
More informationNovartis E2E CM case study
Technical R&D/CHAD CM Unit Novartis E2E CM case study Markus Krumme, CM Unit Head Cambridge, MA September 26, 2016 Continuous Manufacturing at Novartis Basel ~300 m 2 productive area, 2 upstream trains,
More informationMINING SUPPLIERS FROM ONLINE NEWS DOCUMENTS
MINING SUPPLIERS FROM ONLINE NEWS DOCUMENTS Chih-Ping Wei, Department of Information Management, National Taiwan University, Taipei, Taiwan, R.O.C., cpwei@im.ntu.edu.tw Lien-Chin Chen, Department of Information
More informationAutomatic Tagging and Categorisation: Improving knowledge management and retrieval
Automatic Tagging and Categorisation: Improving knowledge management and retrieval 1. Introduction Unlike past business practices, the modern enterprise is increasingly reliant on the efficient processing
More informationMachine Learning: Algorithms and Applications
Machine Learning: Algorithms and Applications Floriano Zini Free University of Bozen-Bolzano Faculty of Computer Science Academic Year 2011-2012 Lecture 4: 19 th March 2012 Evolutionary computing These
More informationEyal Carmi. Google, 76 Ninth Avenue, New York, NY U.S.A. Gal Oestreicher-Singer and Uriel Stettner
RESEARCH NOTE IS OPRAH CONTAGIOUS? THE DEPTH OF DIFFUSION OF DEMAND SHOCKS IN A PRODUCT NETWORK Eyal Carmi Google, 76 Ninth Avenue, New York, NY 10011 U.S.A. {eyal.carmi@gmail.com} Gal Oestreicher-Singer
More informationCLASS/YEAR: II MCA SUB.CODE&NAME: MC7303, SOFTWARE ENGINEERING. 1. Define Software Engineering. Software Engineering: 2. What is a process Framework? Process Framework: UNIT-I 2MARKS QUESTIONS AND ANSWERS
More informationSpatial Information in Offline Approximate Dynamic Programming for Dynamic Vehicle Routing with Stochastic Requests
1 Spatial Information in Offline Approximate Dynamic Programming for Dynamic Vehicle Routing with Stochastic Requests Ansmann, Artur, TU Braunschweig, a.ansmann@tu-braunschweig.de Ulmer, Marlin W., TU
More informationA Systematic Approach to Performance Evaluation
A Systematic Approach to Performance evaluation is the process of determining how well an existing or future computer system meets a set of alternative performance objectives. Arbitrarily selecting performance
More informationABSTRACT. Timetable, Urban bus network, Stochastic demand, Variable demand, Simulation ISSN:
International Journal of Industrial Engineering & Production Research (09) pp. 83-91 December 09, Volume, Number 3 International Journal of Industrial Engineering & Production Research ISSN: 08-4889 Journal
More information