A Smart Tool to analyze the Salary trends of H1-B Workers


Akshay Poosarla, Ramya Vellore Ramesh
Under the guidance of Prof. Meiliu Lu

Abstract

Limiting H1-B visas is bad news for skilled workers in the U.S.; workers from India and many other foreign countries are going to be hit hard. Many employers want skilled workers to stay in the U.S. and work for lower wages. We train our model on an H1-B petitions data set and classify the wages of H1-B employees. The wages of H1-B visa workers depend on multiple factors such as geographical conditions, occupational classification, job title, soc_code and many more. Our aim is to build a model that classifies the salaries of entry level IT sector job positions of H1-B applicants as low, average or high by considering these factors. We compare the classification performance of Naïve Bayes, Support Vector Machines and Decision Trees. We also train a model using Multilinear Regression to predict the salaries as 1 (low), 2 (average) or 3 (high) based on certain thresholds. Our analysis shows that SVM performed better than Naïve Bayes and Decision Trees.

Index Terms

H1-B, soc_code, Occupational Classification

I. INTRODUCTION

The US H1-B visa is a non-immigrant visa that allows US companies to employ graduate level workers in specialty occupations that require theoretical or technical expertise in specialized fields such as IT, finance, accounting, architecture, engineering, mathematics, science and medicine. For foreign workers to work in the U.S., an H1-B petition has to be filed. As part of the H1-B process, a Labor Condition Application has to be filed with the Department of Labor (DOL) to certify that the employer will pay the sponsored H1-B employee the higher of the actual wage at the workplace and the prevailing wage in the industry. Because it is difficult to determine the actual wage at a company, the employer usually looks at the prevailing wage to determine the required salary for an H1-B employee.
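The Labor Condition Application requirement stated above, that the employer pays the higher of the actual wage and the prevailing wage, reduces to a single comparison. The following is a minimal Python sketch of that rule only; the function name `required_wage` and the sample figures are hypothetical and are not taken from the data set.

```python
def required_wage(actual_wage, prevailing_wage):
    """Return the wage the employer must pay under the LCA rule:
    the higher of the actual wage and the prevailing wage."""
    return max(actual_wage, prevailing_wage)


# Hypothetical annual figures in US dollars, for illustration only.
offer = required_wage(actual_wage=72000, prevailing_wage=78500)
```

In this made-up case the prevailing wage exceeds the actual wage, so the prevailing wage is the amount that has to be paid.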
According to the DOL regulations, the actual wage for a particular job at a company is the wage rate the employer pays to other employees with similar experience and qualifications who perform the same job as the H1-B worker. The employer needs to determine whether it has other employees with the same qualifications performing the same job as the H1-B worker. If so, the wage paid to those workers is the "actual wage." If no other employees are doing the same job as the H1-B worker, then the salary offered to the H1-B worker is the actual wage. After determining the actual wage, the employer must compare it to the prevailing wage. If the prevailing wage is higher than the actual wage, the employer must pay the H1-B worker the prevailing wage. The "prevailing wage" is either the applicable wage under a collective bargaining agreement or, if there is no union, the average wage paid to workers in a particular occupation in a specific geographic location.

We predict and analyze the factors, such as job position, employer and work location, on which the wages of H1-B employees depend. We classify the wages of H1-B employees as high, average or low by building models with machine learning algorithms.

The paper is structured as follows: Section 2 presents the literature survey, Section 3 introduces the data collection, Section 4 discusses the data preprocessing techniques, Section 5 gives interesting data insights, and Section 6 presents the machine learning algorithms applied for the model. Section 7 discusses the limitations in R, and Section 8 presents the comparison of the four models.
Section 9 concludes with a comparison of the models.

II. LITERATURE SURVEY

Text analysis to predict H1-B wages for the year 2012 is described in [1]. Decision trees and a sunburst view were used to determine the correlation between job_title, employer_name and many other job attributes, and to predict the wages of H1-B employees by analyzing the text.

In a Kaggle competition on predicting job salaries, one participant used the absolute error and the mean squared error to evaluate predictions of the absolute salary value. The error measured how far each prediction missed the actual value. Four different models were built, each considering different variables [2].

III. DATA COLLECTION

We collected the H1-B petition dataset from the enigma.io website through REST API calls. The dataset contains 647,852 observations with 41 variables. There are more than 12,000 different employers and 10,000 unique job positions, ranging from accounts manager to web developer. Fig 1 shows all 41 columns.

Fig 1. Columns in the Data Set before Preprocessing

IV. DATA PREPROCESSING

We removed NA and blank values from the dataset. Since there are more than 10,000 different job positions, we restricted the scope to predicting and analyzing the salary trends of entry level job positions in the IT sector. We considered 17 entry level job positions for the salary prediction.

A. Handling of Categorical Values

To train the model with some of the machine learning algorithms, some of the categorical values need to be converted to numerical values. We used Python pandas to convert the categorical values to binary.

B. Handling of Missing Values

As the data is not very clean, there are many places where a value in a column is missing. The first method we tried was to replace missing values in numerical columns with the column mean and missing values in categorical columns with the column mode [4]. When we later trained the model with this approach, its predictions were not accurate, so we removed the rows with missing values instead.

C. Data Consistency

To train any model the data should be consistent across the data set, but the raw data from the H1-B petition set is not. We made the data consistent manually. For example, in the employer_state column the state New York is given as NY, New York and NewYork; we manually changed these to the single form New York. Similar operations were performed for all other states wherever the names showed discrepancies. In the job_title column there are job titles that are the same but named differently by different employers, such as SYSTEM ENGINEER and SYSTEMS ENGINEER. We changed SYSTEMS ENGINEER to SYSTEM ENGINEER, and changed COMPUTER INFORMATION SYSTEM MANAGER, COMPUTER SYSTEMS MANAGER, COMPUTER AND INFORMATION SYSTEM MANAGER and COMPUTER INFORMATION SYSTEMS MANAGER to COMPUTER AND INFORMATION SYSTEMS MANAGER. As there is no built-in method to identify this kind of inconsistency, we corrected the data by hand.

D. Removal of Outliers

Most of the prevailing wages were in the range of 16,000 to 400,000, well below the highest value, so we removed the outliers before building the model. Outliers were removed by the normalization method [5]: we computed the quartiles of the data along with the median (50th percentile). The interquartile range (IQR) is the difference between Q3 and Q1, where Q1 is the 25th percentile and Q3 is the 75th percentile. Values above Q3 + 1.5 IQR or below Q1 - 1.5 IQR were considered outliers.

E. Feature Selection

Fig 2. Feature Selection [6]

Feature selection is one of the important steps in data preprocessing. Given m independent variables, it selects n <= m independent variables that play a major role in predicting the output class. To select the important columns out of the 41 columns for predicting the prevailing wage, we used the boruta [7] feature selection package. This method is based on Random Forest and runs for a maximum of 100 iterations; in each iteration it decides whether a column is important or not. Interestingly, for our 41-column dataset the boruta package ran for all 100 iterations and classified 7 columns as important, 5 as average, and the remaining columns as not important in predicting the prevailing wage of the employee. We obtained 7 important features, such as job_title, employer_name, employer_state, agent_attorney_name, agent_attorney_state and soc_code, which were considered for predicting the prevailing wage. All the other variables were rejected by boruta.

Fig 3. Importance of columns after Boruta Package

V. DATA INSIGHTS

A company is termed an H1-B dependent company if at least fifteen percent of its total employees are foreign workers; otherwise it is not H1-B dependent. Fig 4 is a graphical comparison of the number of petitions filed by companies against whether the company is H1-B dependent or not. N represents companies that are not H1-B dependent, while Y represents companies that are. Even though H1-B dependent companies file the larger number of applications, they make up just 10 percent of all companies. This shows how strongly applications filed by H1-B dependent companies dominate.

Fig 4. Distribution between H1-B dependent and Non H1-B dependent companies

Fig 5 shows the top 20 companies by number of applications filed. Infosys, an India-based IT firm, tops the list, followed by Capgemini and Tata Consultancy Services Limited. Of the Big 4 computer science companies, Microsoft ranks highest with 5029 applications, followed by Google with 4785 applications and Amazon with 2547 applications. Interestingly, Facebook is not in the list.

Fig 5. Top 20 Companies with H1-B petitions

Before an H1-B application is made, the labor condition application should be filed with the Department of Labor. All applications with the Department of Labor are classified into four types.

CERTIFIED: The application is certified by the Department of Labor.
DENIED: The application is denied by the Department of Labor.
CERTIFIED_WITHDRAWN: The application is withdrawn by the employer after it is certified by the Department of Labor.
WITHDRAWN: The application is withdrawn by the employer before the Department of Labor takes a decision on it.
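To illustrate how the share of applications in each of the four statuses listed above can be tabulated, here is a minimal Python sketch using only the standard library. The function name `status_shares` and the sample records are hypothetical; they are not drawn from the petition data.

```python
from collections import Counter


def status_shares(statuses):
    """Return the fraction of applications in each case status."""
    counts = Counter(statuses)
    total = sum(counts.values())
    return {status: n / total for status, n in counts.items()}


# Hypothetical sample of 20 applications, for illustration only.
sample = ["CERTIFIED"] * 17 + ["DENIED", "WITHDRAWN", "CERTIFIED_WITHDRAWN"]
shares = status_shares(sample)
```

The resulting dictionary maps each status to its fraction of the total, which is the quantity a distribution chart of application statuses displays.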

Fig 6. Distribution between Applications

From Fig 6 we can say that about 85 percent of the total applications filed are certified by the Department of Labor; the rest are split among the other statuses in the ratio shown in the graph.

Fig 7. Salary Distribution of H1-B Employees

From Fig 7 we can say that most of the salaries of H1-B workers fall in class 2 (average). Class 1 in the graph represents the employees with salaries below the lower threshold, and class 3 represents the employees with salaries above the upper threshold.

VI. APPLIED MODELS

A. Naïve Bayes Classifier

We removed the outliers and divided the prevailing wage into nine classes ranging from 16,000 to 130,000. The width of each class was calculated using normalization, i.e. we used the mean and standard deviation to calculate the width of each class, and determined which of the nine classes each prevailing wage fell into. We built the Naïve Bayes classifier, but the accuracy was around 40%, which is very low. We then divided the prevailing wage into three classes and trained the Naïve Bayes classifier again, but the accuracy was still only around 50%.

Finally we applied a Naïve Bayes classifier with one against many classes [8], dividing the prevailing wage into three classes: 1 (low) for wages below 60,000, 2 (average) for wages between 60,000 and 90,000, and 3 (high) for wages above 90,000. We trained the model one against many, i.e. wages below 60,000 against wages between 60,000 and 90,000, and wages between 60,000 and 90,000 against wages above that range. With this one-against-many Naïve Bayes classifier we obtained an accuracy of around 83 percent.

B. Multilinear Regression

To train the model using multilinear regression [9], all categorical values need to be converted to factors with assigned labels. Since the employer name is a categorical value in our dataset and there are more than 12,000 unique employer names, we were not able to assign labels to each of them; R could not allocate a vector of 9.3 GB when we tried to train on the dataset. We therefore tried converting the categorical values to binary using Python pandas [10] before training with multilinear regression. But when we converted the employer names to binary, the CSV file became corrupted, as the encoding was creating a sparse matrix of roughly 12,000 X 12,000. In essence, when a set of n distinct values is passed in, pandas returns a sparse indicator matrix with one column per value. Because there are so many distinct employer names, our remaining option was to draw a random sample of the data and train the multilinear regression model on it. We did so, and the R-squared value was found to be 0.86; the higher the R-squared value, the better the model.

C. Support Vector Machines

A support vector machine is a classification algorithm that separates two classes. As the target in this model has three classes, we trained our model using one against many classes. Random sampling of the data was done with the Caret package [11] in R, and each sample was divided into a training set and a testing set in the ratio 80 to 20. The one difficulty we encountered after random sampling was a "class not found" error: because there are so many distinct employer names, the test set contained employers that were not present in the training set. So after random sampling we iterated over the test set and removed the values that were present only in the test set. This helped us overcome the problem of new levels present in the test set alone. We trained the model using the one-against-many support vector machines approach [12] and obtained an accuracy of 95.84%.

D. Decision Trees

We trained the model using the decision tree [13] machine learning algorithm and obtained an accuracy of 94.94%. When we tried to plot the decision tree, the predictor variables with more than 52 levels were not printed. This was a limitation in R. Even when the tree could be plotted and visualized, we could not decode the rules corresponding to the decision tree. The decision tree printed in R is shown in Fig 8.

Fig 8. Decision Tree Plot

E. Text Based Analysis

BigML [14] is a machine learning website where a user can create a login and upload a data set of interest. Once the dataset is uploaded, it provides different ways to create a data set by selecting the required columns from the original dataset, with the option to use the complete data or a random sample. Once the data set is created, we can analyze different columns based on the visualizations generated (see Fig 9).

Fig 9. Column Visualization in BigML

We can train our data set with different models, and results are predicted once the model is trained. We used a text based analysis model to analyze and predict the salaries of H1-B employees. The resulting model can be viewed and its rules can be decoded. For example, we can look up the wage corresponding to a software engineer at Facebook, who earns an average of 104k. The tree generated by the model is very UI friendly: you can zoom in on a particular node and get the rule corresponding to that node.

Fig 10. Data path Visualization in BigML

VII. LIMITATIONS OF R

For a large dataset, converting categorical values into numeric values was a major problem, because labels have to be assigned to each of the factors. Assigning labels to a categorical variable with 12,000 levels is a tedious process. We cannot train on the dataset using random forest in R if it contains categorical variables with more than 32 levels; it cannot handle categorical predictors with more than 32 categories. When the decision tree was plotted, the predictor variables with more than 52 levels were not printed, and we could not interpret the rules of the decision tree. Visualization is a limitation in R. It is also very difficult to connect the model to a front end, take input from the user and pass it to the trained model, whereas this is simple in Python.
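The cost of converting a categorical column with many levels into numeric form can be seen in a small sketch of one-hot encoding. This is an illustrative pure-Python version written to avoid dependencies, not the pandas code used in the paper; the function name `one_hot` and the employer names are hypothetical.

```python
def one_hot(values):
    """One-hot encode a list of category labels.

    Returns (levels, rows): the sorted distinct labels and one
    indicator row per input value, with a single 1 in the column
    of that value's label.
    """
    levels = sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    rows = []
    for value in values:
        row = [0] * len(levels)
        row[index[value]] = 1
        rows.append(row)
    return levels, rows


# Hypothetical employer names, for illustration only.
employers = ["ACME", "GLOBEX", "ACME", "INITECH", "GLOBEX"]
levels, encoded = one_hot(employers)
```

The encoded matrix has one row per record and one column per distinct label, so a column with more than 12,000 distinct employer names adds more than 12,000 mostly-zero columns. That growth is why a dense encoding of such a column becomes expensive in memory and on disk.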

VIII. RESULTS

Model                                         Result
Naïve Bayes Classifier (one against many)     Accuracy: 83%
Support Vector Machines (random sampling)     Accuracy: 95.84%
Decision Trees                                Accuracy: 94.94%
Multilinear Regression (random sampling)      R-squared error: 0.56

IX. CONCLUSION

According to our decision tree text analysis, California is the state with the highest average wage, and the most important factor in predicting an employee's wage is the job title, followed by location and then the employer name. When the target variable has multiple classes, the one-against-many Naïve Bayes classifier gave better results than the plain Naïve Bayes method. The decision tree gives very good results, but with many factors and levels it is difficult to decode its rules. Text analysis worked well for this data set, as we could infer more results and rules from the data.

X. REFERENCES

[1]
[2] 2.pdf
[4] P. Khongchai and P. Songmuang, "Improving students' motivation to study using salary prediction system," th International Joint Conference on Computer Science and Software Engineering (JCSSE), Khon Kaen, 2016.
[5] Z. J. Kovacic, "Early Prediction of Student Success: Mining Students Enrolment Data," Proceedings of Informing Science & IT Education Conference (InSITE).
[6] G. Forman, "An Extensive Empirical Study of Feature Selection Metrics for Text Classification," Journal of Machine Learning Research, vol. 3.
[7]
[8] S. Rana and A. Singh, "Comparative analysis of sentiment orientation using SVM and Naive Bayes techniques," nd International Conference on Next Generation Computing Technologies (NGCT), Dehradun, 2016.
[9] otes/401-multreg.pdf
[10]
[11] t.pdf
[12] I. Dilrukshi and K. De Zoysa, "Twitter news classification: Theoretical and practical comparison of SVM against Naive Bayes algorithms," 2013 International Conference on Advances in ICT for Emerging Regions (ICTer), Colombo, 2013.
[13] J. R. Quinlan, "Induction of Decision Trees," Boston: Kluwer Academic Publishers, vol. 1.
[14]

Understanding General Trends in Permanent Visa Applications and Predicting Visa Decisions using SAS Enterprise Miner.

Understanding General Trends in Permanent Visa Applications and Predicting Visa Decisions using SAS Enterprise Miner. Understanding General Trends in Permanent Visa Applications and Predicting Visa Decisions using SAS Enterprise Miner. ARUN TEJA BAIREDDLAPALLI KRISHNA REDDY OKLAMOHA STATE UNIVERSITY Contents ABSTRACT...

More information

Who Is Likely to Succeed: Predictive Modeling of the Journey from H-1B to Permanent US Work Visa

Who Is Likely to Succeed: Predictive Modeling of the Journey from H-1B to Permanent US Work Visa Who Is Likely to Succeed: Predictive Modeling of the Journey from H-1B to Shibbir Dripto Khan ABSTRACT The purpose of this Study is to help US employers and legislators predict which employees are most

More information

Progress Report: Predicting Which Recommended Content Users Click Stanley Jacob, Lingjie Kong

Progress Report: Predicting Which Recommended Content Users Click Stanley Jacob, Lingjie Kong Progress Report: Predicting Which Recommended Content Users Click Stanley Jacob, Lingjie Kong Machine learning models can be used to predict which recommended content users will click on a given website.

More information

Applying Regression Techniques For Predictive Analytics Paviya George Chemparathy

Applying Regression Techniques For Predictive Analytics Paviya George Chemparathy Applying Regression Techniques For Predictive Analytics Paviya George Chemparathy AGENDA 1. Introduction 2. Use Cases 3. Popular Algorithms 4. Typical Approach 5. Case Study 2016 SAPIENT GLOBAL MARKETS

More information

Experiences in the Use of Big Data for Official Statistics

Experiences in the Use of Big Data for Official Statistics Think Big - Data innovation in Latin America Santiago, Chile 6 th March 2017 Experiences in the Use of Big Data for Official Statistics Antonino Virgillito Istat Introduction The use of Big Data sources

More information

Big Data Mining in Twitter using Python Clients

Big Data Mining in Twitter using Python Clients University of New Orleans ScholarWorks@UNO Innovate UNO InnovateUNO Fall 2017 Big Data Mining in Twitter using Python Clients Sanjiv Pradhanang University of New Orleans Follow this and additional works

More information

POST GRADUATE PROGRAM IN DATA SCIENCE & MACHINE LEARNING (PGPDM)

POST GRADUATE PROGRAM IN DATA SCIENCE & MACHINE LEARNING (PGPDM) OUTLINE FOR THE POST GRADUATE PROGRAM IN DATA SCIENCE & MACHINE LEARNING (PGPDM) Module Subject Topics Learning outcomes Delivered by Exploratory & Visualization Framework Exploratory Data Collection and

More information

FINAL PROJECT REPORT IME672. Group Number 6

FINAL PROJECT REPORT IME672. Group Number 6 FINAL PROJECT REPORT IME672 Group Number 6 Ayushya Agarwal 14168 Rishabh Vaish 14553 Rohit Bansal 14564 Abhinav Sharma 14015 Dil Bag Singh 14222 Introduction Cell2Cell, The Churn Game. The cellular telephone

More information

Data Mining Applications with R

Data Mining Applications with R Data Mining Applications with R Yanchang Zhao Senior Data Miner, RDataMining.com, Australia Associate Professor, Yonghua Cen Nanjing University of Science and Technology, China AMSTERDAM BOSTON HEIDELBERG

More information

Predicting Customer Purchase to Improve Bank Marketing Effectiveness

Predicting Customer Purchase to Improve Bank Marketing Effectiveness Business Analytics Using Data Mining (2017 Fall).Fianl Report Predicting Customer Purchase to Improve Bank Marketing Effectiveness Group 6 Sandy Wu Andy Hsu Wei-Zhu Chen Samantha Chien Instructor:Galit

More information

Drive Better Insights with Oracle Analytics Cloud

Drive Better Insights with Oracle Analytics Cloud Drive Better Insights with Oracle Analytics Cloud Thursday, April 5, 2018 Speakers: Jason Little, Sean Suskind Copyright 2018 Sierra-Cedar, Inc. All rights reserved Today s Presenters Jason Little VP of

More information

PREDICTING EMPLOYEE ATTRITION THROUGH DATA MINING

PREDICTING EMPLOYEE ATTRITION THROUGH DATA MINING PREDICTING EMPLOYEE ATTRITION THROUGH DATA MINING Abbas Heiat, College of Business, Montana State University, Billings, MT 59102, aheiat@msubillings.edu ABSTRACT The purpose of this study is to investigate

More information

Applications of Machine Learning to Predict Yelp Ratings

Applications of Machine Learning to Predict Yelp Ratings Applications of Machine Learning to Predict Yelp Ratings Kyle Carbon Aeronautics and Astronautics kcarbon@stanford.edu Kacyn Fujii Electrical Engineering khfujii@stanford.edu Prasanth Veerina Computer

More information

New Customer Acquisition Strategy

New Customer Acquisition Strategy Page 1 New Customer Acquisition Strategy Based on Customer Profiling Segmentation and Scoring Model Page 2 Introduction A customer profile is a snapshot of who your customers are, how to reach them, and

More information

Using decision tree classifier to predict income levels

Using decision tree classifier to predict income levels MPRA Munich Personal RePEc Archive Using decision tree classifier to predict income levels Sisay Menji Bekena 30 July 2017 Online at https://mpra.ub.uni-muenchen.de/83406/ MPRA Paper No. 83406, posted

More information

Data Science Training Course

Data Science Training Course About Intellipaat Intellipaat is a fast-growing professional training provider that is offering training in over 150 most sought-after tools and technologies. We have a learner base of 600,000 in over

More information

From Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Full book available for purchase here.

From Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Full book available for purchase here. From Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Full book available for purchase here. Contents List of Figures xv Foreword xxiii Preface xxv Acknowledgments xxix Chapter

More information

Restaurant Recommendation for Facebook Users

Restaurant Recommendation for Facebook Users Restaurant Recommendation for Facebook Users Qiaosha Han Vivian Lin Wenqing Dai Computer Science Computer Science Computer Science Stanford University Stanford University Stanford University qiaoshah@stanford.edu

More information

Data Analytics on a Yelp Data Set. Maitreyi Tata. B.Tech., Gitam University, India, 2015 A REPORT

Data Analytics on a Yelp Data Set. Maitreyi Tata. B.Tech., Gitam University, India, 2015 A REPORT Data Analytics on a Yelp Data Set by Maitreyi Tata B.Tech., Gitam University, India, 2015 A REPORT submitted in partial fulfillment of the requirements for the degree MASTER OF SCIENCE Department of Computer

More information

Airbnb Capstone: Super Host Analysis

Airbnb Capstone: Super Host Analysis Airbnb Capstone: Super Host Analysis Justin Malunay September 21, 2016 Abstract This report discusses the significance of Airbnb s Super Host Program. Based on Airbnb s open data, I was able to predict

More information

Predicting Restaurants Rating And Popularity Based On Yelp Dataset

Predicting Restaurants Rating And Popularity Based On Yelp Dataset CS 229 MACHINE LEARNING FINAL PROJECT 1 Predicting Restaurants Rating And Popularity Based On Yelp Dataset Yiwen Guo, ICME, Anran Lu, ICME, and Zeyu Wang, Department of Economics, Stanford University Abstract

More information

Strength in numbers? Modelling the impact of businesses on each other

Strength in numbers? Modelling the impact of businesses on each other Strength in numbers? Modelling the impact of businesses on each other Amir Abbas Sadeghian amirabs@stanford.edu Hakan Inan inanh@stanford.edu Andres Nötzli noetzli@stanford.edu. INTRODUCTION In many cities,

More information

Predicting Airbnb Bookings by Country

Predicting Airbnb Bookings by Country Michael Dimitras A12465780 CSE 190 Assignment 2 Predicting Airbnb Bookings by Country 1: Dataset Description For this assignment, I selected the Airbnb New User Bookings set from Kaggle. The dataset is

More information

Business Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee

Business Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Business Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Lecture - 02 Data Mining Process Welcome to the lecture 2 of

More information

Machine Learning Models for Sales Time Series Forecasting

Machine Learning Models for Sales Time Series Forecasting Article Machine Learning Models for Sales Time Series Forecasting Bohdan M. Pavlyshenko SoftServe, Inc., Ivan Franko National University of Lviv * Correspondence: bpavl@softserveinc.com, b.pavlyshenko@gmail.com

More information

Data Visualization and Improving Accuracy of Attrition Using Stacked Classifier

Data Visualization and Improving Accuracy of Attrition Using Stacked Classifier Data Visualization and Improving Accuracy of Attrition Using Stacked Classifier 1 Deep Sanghavi, 2 Jay Parekh, 3 Shaunak Sompura, 4 Pratik Kanani 1-3 Students, 4 Assistant Professor 1 Information Technology

More information

Preface to the third edition Preface to the first edition Acknowledgments

Preface to the third edition Preface to the first edition Acknowledgments Contents Foreword Preface to the third edition Preface to the first edition Acknowledgments Part I PRELIMINARIES XXI XXIII XXVII XXIX CHAPTER 1 Introduction 3 1.1 What Is Business Analytics?................

More information

SAP Predictive Analytics Suite

SAP Predictive Analytics Suite SAP Predictive Analytics Suite Tania Pérez Asensio Where is the Evolution of Business Analytics Heading? Organizations Are Maturing Their Approaches to Solving Business Problems Reactive Wait until a problem

More information

Convex and Non-Convex Classification of S&P 500 Stocks

Convex and Non-Convex Classification of S&P 500 Stocks Georgia Institute of Technology 4133 Advanced Optimization Convex and Non-Convex Classification of S&P 500 Stocks Matt Faulkner Chris Fu James Moriarty Masud Parvez Mario Wijaya coached by Dr. Guanghui

More information

CSE 255 Lecture 3. Data Mining and Predictive Analytics. Supervised learning Classification

CSE 255 Lecture 3. Data Mining and Predictive Analytics. Supervised learning Classification CSE 255 Lecture 3 Data Mining and Predictive Analytics Supervised learning Classification Last week Last week we started looking at supervised learning problems Last week We studied linear regression,

More information

Predicting the Odds of Getting Retweeted

Predicting the Odds of Getting Retweeted Predicting the Odds of Getting Retweeted Arun Mahendra Stanford University arunmahe@stanford.edu 1. Introduction Millions of people tweet every day about almost any topic imaginable, but only a small percent

More information

An Implementation of genetic algorithm based feature selection approach over medical datasets

An Implementation of genetic algorithm based feature selection approach over medical datasets An Implementation of genetic algorithm based feature selection approach over medical s Dr. A. Shaik Abdul Khadir #1, K. Mohamed Amanullah #2 #1 Research Department of Computer Science, KhadirMohideen College,

More information

Analytics for Banks. September 19, 2017
