Content-Based Cross-Domain Recommendations Using Segmented Models


Shaghayegh Sahebi
Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA

Trevor Walker
LinkedIn, Mountain View, CA
twalker@linkedin.com

ABSTRACT

Cross-domain recommendation is a new field of study in the area of recommender systems. The goal of this type of recommender system is to use information from other source domains to provide recommendations in target domains. In this work, we provide a generic framework for content-based cross-domain recommendations that can be used with various classifiers. In this framework, we propose an efficient method of feature augmentation to implement adaptation of domains. Instead of defining the notion of domain based on item descriptions, we introduce user-based domains. We define meta-data features as a set of features that characterize the fields that domains come from, and introduce indicator features to segment users into different domains based on the values of the meta-data features. We study an implementation of our framework based on logistic regression and perform experiments on a dataset from LinkedIn to perform job recommendations. Our results show promising performance in certain domains of the data.

Copyright 2014 for the individual papers by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.

1. INTRODUCTION

Recommender systems can help users address the information overload problem by providing related items that take the user's interests into account. So far, most recommender systems have focused on specific domains, recommending one type of item (such as books) to all categories of users. Recent research introduced cross-domain recommender systems that aim to take advantage of shared information among various domains [10]. In cross-domain recommendation, the goal is to use information from various source domains to recommend items in target domains. Previous studies on collaborative filtering cross-domain recommender systems have shown an improvement in the accuracy of recommendations, especially in the cold-start case [14].

Most of the work on cross-domain recommender systems and the definition of domains in them has been based on cross-domain collaborative filtering methods and has ignored the domains that can be defined based on user specifications. Li [10] has categorized the domains in cross-domain recommendation into system, data, and temporal domains. These domains are related to, respectively, the different datasets that a recommender system is built upon, the various representations of user preferences (explicit or implicit), and the various time points at which the data is gathered. Although this is a good classification of possible domains in recommender systems, it focuses on the types of domains that are defined based on items. In other words, the notion of domain is usually selected as a constant characteristic of items, systems, etc. For example, the type of items (e.g. books, movies, etc.) [14], the genre of items (e.g. for movies) [1], or indicators of the various systems that the data is gathered from [6] are some of the features that have been used as domain indicators. Joshi et al. [5] have chosen domains based on meta-data features. These meta-data features can be selected from all unique subsets of features by experimenting (e.g. selecting the best performing features on a validation set), which is a time-consuming task. Choosing the proper domain indicator among features is still an open field of research.

In this paper, we propose a framework for performing content-based cross-domain recommendation. We propose that in content-based and hybrid recommender systems, the domain notion can be extended to the type or domain of users.
Here, the definition of domain can be determined based on the recommendation task and users' information, such as users' demographic data. For example, in a movie recommendation task, the age of a user can be an effective factor in deciding which movies match her best. As another example, in job recommendation, we expect the model parameters to be different for different job functions of users. For a designer, it is important to have skills matching the job description, while for a network engineer, his certificates might have more importance. We choose a job recommender application for the experiments in this paper, although the framework is applicable to other recommender system domains.

With many different jobs listed online in various industries with different job descriptions, it is essential for people to find the job that best matches their abilities and specifications. Searching for the right job is a time-consuming task for a user. It requires spending a lot of effort on defining the criteria the user is looking for. Job recommender systems can address this problem by actively finding good job matches for the user, utilizing her profile information, search keywords, etc. Based on previous results in the job recommendation literature [8, 9] and our field experience, we believe that job recommendations can benefit from cross-domain information.

In this paper, our proposed framework can be utilized by various algorithms defined on any notion of domain from data attributes. Our work also differs from the existing literature in defining domains on the user-profile side instead of the item side. It can, of course, be used for domains defined on item-set features. We experiment with LinkedIn data for job recommendations. Our experiments lead to promising results for content-based cross-domain recommendations based on user job functions. In Section 2, we briefly discuss related literature. In Section 3, we introduce our approach to content-based cross-domain recommendation. We present our dataset and experiment setup in Section 4 and we discuss the results in Section 5.

2. RELATED WORK

2.1 Segmented Regression Model

Segmented regression or piece-wise regression [13] can be used as a classification method in which data features are partitioned into intervals using some breakpoints. In the final model, a separate model will fit each of the segments. This model is useful for approximating higher-degree models with multiple lower-degree models in smaller ranges. In this paper, we adapt this method to our problem of cross-domain recommendation.

2.2 Feature Augmentation in Domain Adaptation

Feature augmentation was introduced by Daumé [3] in the domain-adaptation literature. In his paper, Daumé considers a source domain and a target domain separately. He augments each of these domains individually by copying the feature space three times: once copying all features for the general version, and once for each of the source and target versions. Eventually, the augmented source data will contain only general and source-specific versions and the augmented target data will contain general and target-specific versions. After Daumé's paper, this method has been used in the domain-adaptation field, especially in Natural Language Processing (NLP) [2, 4, 5]. Our approach extends Daumé's model by allowing multiple meta-data features for defining the domains (instead of having one dimension of source and target domains).
In addition, each of the meta-data features can have multiple values and define multi-dimensional domains. Moreover, we can have separate sets of common (overlapping) and uncommon features in the main-effect and domain-specific models. Our model is extensible to incorporating cross-products of domain indicator features.

2.3 Job Recommendation

Despite the importance of job recommender systems, there has not been much research on this subject. Rafter et al. [12] introduced CASPER, an intelligent online recruitment service. Keim [7] provided a multilayer framework to support the matching of individuals for recruitment processes. In [11], Hutterer used hybrid user profiling to enhance the job recommendation results. He incorporated explicit and implicit feedback of the user in the user profile. Lee and Brusilovsky [8, 9] implemented and experimented with Proactive, which has multiple interfaces for various types of users. They showed that different users use various information resources to look for the perfect job.

3. OUR FRAMEWORK: SEGMENTED MODEL FOR CROSS-DOMAIN RECOMMENDATION

In cross-domain recommendation, we aim to build a model that is general and flexible enough to transfer the information in multiple related domains, and specific enough to capture particular aspects of each individual domain. This means that we expect a trade-off between the bias and the variance in our model. We want all models to be close to each other in particular dimensions (having less variance) and we want them to be biased towards each domain's specific distribution. For example, if we think of various job functions as different domains in job recommendation, we expect the user profile to have a good match to the job description in all domains (a common feature of the domains). Also, we expect the skills feature to be more important for an artist than for a university professor (a domain-specific feature). If we consider one main model for all of the data present in various domains, we are going to have no variance in the model, but we will lose the bias we are looking for.
On the other hand, if we treat each domain with a separate model, we will achieve the bias each domain is introducing, but we will have too much variance in the achieved models. In other words, we will lose the ability to transfer common information among different models.

Our framework consists of two parts: the main-effect model and the domain-specific models. The main-effect model captures the shared statistics among all domains. We have one domain-specific model per domain to capture the domain-specific characteristics. A general formulation of the model can be seen in Equation 1.

$$\text{Final-Model} = \text{Main-Effect Model} + \sum_{i \in \text{Domains}} \text{Domain-Specific Model}_i \tag{1}$$

3.1 Cross-Domain Augmentation and Segmentation

As said before, we characterize the fields that the domains come from by some features called meta-data features. Each dataset has a set of features, such as user-related features, item-related features, and features that represent similarities between users and items, which we call base features. Meta-data features are a subset of base features that specify the aspects on which we want to define the domains. Each domain is constructed based on the values of these meta-data features and their combinations. For example, if we want to recommend movies to users, base features are user features, such as age, education, language, etc., item features, such as movie genre, actors, director, etc., or the relationship between users and items, such as the similarity between each movie genre and the genres that a user likes. We can choose some of these base features as meta-data features to define the notion of domain based on them. For example, we can choose the genre base feature as the meta-data feature. In this case, the defined domains will be action movies, drama movies, action-drama movies, etc. If we choose two meta-data features, the domains can be a combination of values for those meta-data features. For example, if we choose genre of movie and age of user as meta-data features, the domains will be like: action-middle-age, drama-young, etc. In the case of user-based domains for job recommendation, we can choose some base features of users, such as job function or job seniority, as meta-data features. For example, if we want to define the domains based on job function, people who have an IT job function form one domain and people who have a medical job function form another domain.

3.1.1 Augmentation

Each of the domain-specific models in Equation 1 works on one domain's data. As a result, we should split the dataset for each domain and provide each domain-specific model with the section of the dataset related to that domain. To address this splitting, we expand on the idea of the segmented regression model. We propose to augment the feature space based on the domain definitions and copy each datapoint into the related domain's sub-space. The main space is then used for the main-effect model and each copy of the space is used for the related domain-specific model. However, this augmentation has a problem: if we have $k$ different domains (e.g. $k$ different job functions), we need to partition the data space into $2^k$ separate segments (copies) to capture all of the different settings of the dataset, which takes too much space. Each of these copies corresponds to one subset of the possible combinations of meta-data feature values (domains). For example, if we want to partition users based on the job function represented in their resume, and we have three different possible job functions (e.g. operations, education, and sales), we will have to partition the data into $2^3 = 8$ segments: people for whom the job function is in operations, the ones who have the sales function, the ones who have the education job function, the ones who have both sales and operations functions, etc. In addition to the space problem, segmenting the data into combinations of domains may lead to very sparse copies of the original dataset.
Additionally, looking at all the various combinations of the domains and their interactions might not be necessary for our purpose.

3.1.2 Indicator Features for Segmentation

To alleviate the problem indicated in Section 3.1.1, we expand on the idea of the segmented regression model and introduce indicator features for each domain. These features allow us to augment the feature space in a polynomial order, while being able to keep the main-effect model and control the granularity of combinations among different domains. Suppose that each meta-data feature can take $k$ different values and segment our data into $k$ domains, and suppose that we are choosing only one meta-data feature. For each of the $k$ possible values, we define a binary indicator feature, representing whether a data point falls into that specific domain or not. As a result, we will end up with $k$ binary indicator features for the selected meta-data feature. Eventually, we will augment the feature space based on the indicator features in the following way: we keep the original feature space for the common features used in the main-effect model. For each of the features falling into the domain-specific models, we augment the feature space by copying it $k$ times, once for each of the meta-data feature values. Consequently, we have a polynomial expansion of space. Note that we do not consider combinations of meta-data feature values yet.

Figure 1: Augmented Feature Space with c Common Features in the Main Effect Model, f Features in Each of the d Domains, that Are Represented by Indicator Features with k Possible Values

Now, if we have $d$ meta-data features to choose the domains from, each of which can take $k$ values, and $f$ of the base features exist in each of the domain-specific models, we should replicate this $f$-dimensional space $dk + 1$ times. This number is polynomial in $d$ (the number of meta-data features) and $k$ (the number of values for each meta-data feature). In contrast, if we had not used the indicator features in the segmented model, the $f$-dimensional feature space would have been replicated $d^k$ times.
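The indicator-based augmentation described above can be sketched for a single meta-data feature as follows. This is a minimal illustration rather than the authors' implementation: the function name `augment`, the feature values, and the choice to zero out the blocks of inactive domains are assumptions made for the example.

```python
def augment(common, specific, active_domains, all_domains):
    """Build one augmented row (hypothetical helper, not from the paper).

    The row starts with the common (main-effect) features, followed by one
    copy of the domain-specific features per domain. A copy is kept when the
    binary indicator for that domain is 1 and zeroed out otherwise.
    """
    row = list(common)
    for dom in all_domains:
        indicator = 1.0 if dom in active_domains else 0.0
        row.extend(indicator * value for value in specific)
    return row


# Three job functions as domains, c = 1 common feature, f = 2 specific features.
domains = ["operations", "education", "sales"]
sales_only = augment([0.7], [0.2, 0.9], {"sales"}, domains)
sales_and_ops = augment([0.7], [0.2, 0.9], {"sales", "operations"}, domains)

print(sales_only)
print(sales_and_ops)
print(len(sales_only))  # c + f * k columns for one meta-data feature
```

A user who belongs to several domains simply activates several blocks, so the row width stays at c + f * k columns here instead of requiring one segment per subset of domains.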
Considering $c$ common features for the main-effect model, we can represent the new feature space by Figure 1.

3.1.3 Challenges and Advantages

An advantage of this framework is its extensibility to higher-order cross-products of values between and within domains. For example, if we want to consider the effect of interaction between two domains, we can extend the model to consider an indicator feature representing cross-products of feature values in the domains. E.g., if we want to take into account the combination of every two feature values within each of the domains, we will end up with a space that has $O(c + f(dk + 1) + f(dk^2 + 1))$ dimensions, or is expanded $dk + dk^2$ times. This gives us the ability to control the dimensionality of the feature space while avoiding sparsity in each of the segments.

Another challenge is choosing the features that should be in the main-effect model and the ones that should remain as domain-specific features. In other words, which features should be responsible for transferring the information among domains (controlling the variance) and which ones should provide the domain-specific bias? In our approach, the model can learn which features to use in the main-effect model and which features to use in each of the domains using regularization. Since regularization pushes coefficient values to be as close to zero as possible, the less important coefficients of the model will have very small values and are removed from the model. While the model can choose between these sets of features, we can also initialize the main-effect and

domain-specific features using the expert's domain knowledge. Besides, in some cross-domain recommender problems, each domain has its own subset of a general feature set, which might differ in size or type from other domains' feature sets. We expect our cross-domain solution to consider this problem and be extensible to domains with heterogeneous numbers and types of features.

3.2 Implementation Using Logistic Regression

Although the approach we presented here can be used in various classification algorithms, we used a straightforward classification algorithm to implement the model. Suppose that $f_{c_i}$ is the $i$-th common feature among the domains; $M$ is the set of meta-data features; $V_j$ is the set of values (domains) for the $j$-th meta-data feature; $I_{ij}$ is the binary indicator feature for domain $i$ of meta-data feature $j$; $f_{ijk}$ represents the $k$-th base feature specific to the $i$-th domain of meta-data feature $j$; and $p$ is the probability of the model's outcome. Equation 2 shows the resulting segmented regression model with indicator features. As we can see, it has a simple representation that can be implemented easily for domain adaptation.

$$\operatorname{logit}(p) = \sum_{i} w_{c_i} f_{c_i} + \sum_{j \in M} \sum_{i \in V_j} I_{ij} \sum_{k} w_{f_{ijk}} f_{ijk} \tag{2}$$

Here $w$ represents the weight (importance) of each feature in the model. In case we want to extend it to two-way interaction effects of values of each meta-data feature (belonging to two domains), we will have:

$$\operatorname{logit}(p) = \sum_{i} w_{c_i} f_{c_i} + \sum_{j \in M} \sum_{i \in V_j} I_{ij} \sum_{k} w_{f_{ijk}} f_{ijk} + \sum_{j \in M} \sum_{i \in V_j} \sum_{l \in V_j} I_{i,l,j} \sum_{k} w_{f_{i,l,j,k}} f_{i,l,j,k} \tag{3}$$

In Equation 3, $I_{i,l,j}$ represents the binary indicator feature for the datapoint belonging to both the $i$ and $l$ domains (or having both the $i$ and $l$ values) of meta-data feature $j$. To decide which features should be in the common set of features and which should be in each of the domain-specific models, we use $L_2$ regularization.

4. EXPERIMENTAL SETUP

4.1 Data

The dataset we are using in this study is LinkedIn's job application data. It contains records of users and job features and a binary label indicating whether the user has applied for the job or not.
Some of the base features are calculated similarities between the job and the user. For example, we use TF.IDF to calculate the similarity of the job description with the user's skills and store it as a feature in the user-job record. Some other features, from which we have chosen the meta-data features, are user-specific: for example, the past and current job functions of a user, the past and current industries the user has worked in, or the geographical location of the user. Meta-data features should have categorical values, so that we can extract binary indicator features from them. In case we want to use features with continuous values, we partition the values into more than one category.

Figure 2: Coverage of User Current Job Functions in Offline Data

The offline dataset used in the following experiments consists of three million records of more than 150,000 users. There is a one-to-ten ratio of positive job applications to negative job applications in the dataset. We use 100 user-job features as base features in the model. We pick one meta-data feature (the user's current job function) from the user-specific features and split domains based on it. This feature is specifically related to what a user does in his/her job, e.g., a user can be an IT (job function) engineer in a bank. We choose this feature based on our experience that people working in various functions (e.g. the arts and legal domains) have different requirements and definitions for a good job recommended to them. Based on the LinkedIn data, the current job function of a user can have 26 different values. Each user can have multiple job functions at the same time. The distribution of user job functions is not uniform in the dataset: some functions are covered more and some include fewer users. Figure 2 shows the coverage of current job functions in the data. As we can see in the picture, Sales, Operations, and Information Technology are among the most covered job functions in the data, and Community and Social Services, Real Estate, and Military and Protective Services are the job functions with the least coverage.
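The extraction of binary indicator features from a categorical meta-data feature, and the partitioning of a continuous feature into categories, can be sketched as below. The helper names `indicator_features` and `bucketize`, the three-value vocabulary, and the "years of experience" boundaries are all hypothetical and chosen only for illustration; the paper does not specify how continuous values are binned.

```python
from bisect import bisect_right


def indicator_features(user_values, vocabulary):
    """One binary indicator per possible value of a categorical meta-data feature.

    A user may hold several values at once (e.g. several job functions),
    in which case several indicators are set to 1.
    """
    return {value: int(value in user_values) for value in vocabulary}


def bucketize(value, boundaries, labels):
    """Map a continuous value to a category using sorted boundaries.

    len(labels) must equal len(boundaries) + 1.
    """
    return labels[bisect_right(boundaries, value)]


# Hypothetical subset of the job-function vocabulary.
JOB_FUNCTIONS = ["Sales", "Operations", "Information Technology"]
user_indicators = indicator_features({"Sales", "Operations"}, JOB_FUNCTIONS)

# Hypothetical continuous feature: years of experience, split at 2 and 5 years.
seniority = bucketize(3.5, [2, 5], ["junior", "mid", "senior"])

print(user_indicators)
print(seniority)
```

Once a continuous feature has been mapped to a category this way, it can be passed through `indicator_features` like any other categorical meta-data feature.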
4.2 Implementation of Models for Job Recommendation

As we discussed in Section 3, we can have two extremes of modeling users as our baselines: a) when there is only one main model for all of the users, and b) when there is a separate model for each segment of users and there is no shared


information among the models. In each of our studies, we compare our model to at least one of these two baselines.

To capture the effect of domain granularity on recommendation results, we experiment with three different settings for domains: segmenting on two domain indicator features versus all other domains (two-vs-all), segmenting on all indicator features of a domain (all-indicators), and segmenting on clusters of indicator features of a domain (domain-clusters). Figure 3 shows a graphical demonstration of user segmentation in each case for job functions.

Figure 3: Three Experiment Settings with Different Granularities of Domain Definition

For the two-vs-all model, we choose the two most covered values of the selected meta-data feature and define three indicator features based on them: the indicator feature that selects users with the most covered value, the indicator feature that selects users with the second most covered value, and the indicator feature for the rest of the users. For example, for the user's job function meta-data feature, we will have the following indicator features: $I_S$ for users with the Sales job function, $I_O$ for users with the Operations job function, and $I_{Other}$ for all other users. The final model is presented in Equation 4. Here, $f_{c_i}$ is the $i$-th common feature among domains; $f_{S_i}$, $f_{O_i}$, and $f_{Other_i}$ are features used in the Sales, Operations, and Other domains respectively; $w_j$ is the weight for feature $j$; and $p$ is the probability that the target user applies for the target recommended job. Note that since we have chosen only one meta-data feature, we do not need to present it in the model (e.g. $j \in M$ in Eq. 2). Also, note that by including $I_{Other}$ as an indicator feature, we are capturing the effect of the interaction or cross-product of the Sales and Operations domains.

$$\operatorname{logit}(p) = \sum_{i} w_{c_i} f_{c_i} + I_S \sum_{i} w_{f_{S_i}} f_{S_i} + I_O \sum_{i} w_{f_{O_i}} f_{O_i} + I_{Other} \sum_{i} w_{f_{Other_i}} f_{Other_i} \tag{4}$$

For the all-indicators model, we segment based on all values of the selected meta-data feature.
If the meta-data feature can take $k$ values, we will end up with $k$ indicator features to segment all users into $k$ different partitions. For the user's job function meta-data feature, we end up with 26 different indicator features. Our final model is shown in Eq. 5.

$$\operatorname{logit}(p) = \sum_{i} w_{c_i} f_{c_i} + \sum_{i} I_i \sum_{j} w_{f_{i,j}} f_{i,j} \tag{5}$$

Here, $I_i$ is the indicator feature for domain $i$ (different values of job function), and $f_{i,j}$ represents the $j$-th base feature of domain $i$.

For the last set of experiments (domain-clusters), we cluster the values of meta-data features into groups. Each group represents a cluster of domains. We use one indicator feature for each group. We use spectral clustering to group the 26 different job functions into 8 clusters. The clusters are based on the user transitions between job functions in the data. Figure 4 shows a tag-cloud representation of these clusters. Each color indicates one cluster. As we can see in the picture, functions like Sales and Marketing are clustered together, and the Research and Education functions fall in one cluster. We run the segmented model with one indicator per cluster. Suppose that $C$ is the set of cluster indicators for a meta-data feature. The final model is shown in Equation 6.

Figure 4: Clusters of User Job Function

$$\operatorname{logit}(p) = \sum_{i} w_{c_i} f_{c_i} + \sum_{i \in C} I_i \sum_{j} w_{f_{i,j}} f_{i,j} \tag{6}$$

The final model we have in domain-clusters is similar to the all-indicators model, but the domain indicators are representative of each cluster of job functions.

5. PERFORMANCE ANALYSIS

We experiment in the two-vs-all and domain-clusters settings for the current job function of users as the meta-data feature. We divide the data into 70% train and 30% test subsets. We measure the accuracy of the algorithms on the test set. To find the performance of the algorithm in each domain of the data, we partition the test set into domains in the same way that we partitioned the training set and calculate the accuracy in each domain of the dataset. Our model is compared to at least one of the two baseline models: the one-for-all and independent models.
The one-for-all model only contains one model for all of the datapoints, ignoring the domain-specific models. The independent model trains a separate model for each of the domains independently. This model ignores the common information among the domains and treats them as independent from each other. To dig deeper into the offline results, we look at the coefficient values obtained by the algorithm in each of the models.
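The evaluation protocol described above (a 70/30 split, a one-for-all baseline, a segmented model, and accuracy computed per domain on the test set) can be sketched on synthetic data as follows. Everything here is an assumption made for illustration: the data is generated, not LinkedIn data, the two domains and the single base feature are hypothetical, and the plain gradient-descent logistic regression with an $L_2$ penalty stands in for whatever solver the authors used.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X, y, l2=0.001, lr=0.5, epochs=200):
    # Batch gradient descent; the last weight is an unregularized bias.
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(a * b for a, b in zip(w, xi)) + w[d]) - yi
            for j in range(d):
                grad[j] += err * xi[j]
            grad[d] += err
        for j in range(d):
            w[j] -= lr * (grad[j] / n + l2 * w[j])
        w[d] -= lr * grad[d] / n
    return w


def predict(w, x):
    return 1 if sum(a * b for a, b in zip(w, x)) + w[-1] >= 0 else 0


DOMAINS = ["sales", "operations"]  # hypothetical two-domain setting


def one_for_all(x, dom):
    # Baseline: a single model that ignores the domain.
    return [x]


def segmented(x, dom):
    # Main-effect copy plus one indicator-gated copy per domain.
    return [x] + [x if dom == name else 0.0 for name in DOMAINS]


def accuracy_by_domain(w, rows, featurize):
    hits, totals = {}, {}
    for x, dom, label in rows:
        totals[dom] = totals.get(dom, 0) + 1
        hits[dom] = hits.get(dom, 0) + int(predict(w, featurize(x, dom)) == label)
    return {dom: hits[dom] / totals[dom] for dom in totals}


rng = random.Random(0)
rows = []
for i in range(400):
    dom = DOMAINS[i % 2]
    x = rng.uniform(-1.0, 1.0)
    # Synthetic labels: the same feature has opposite effects in the two domains.
    label = int(x > 0) if dom == "sales" else int(x < 0)
    rows.append((x, dom, label))
rng.shuffle(rows)
split = len(rows) * 7 // 10  # 70% train, 30% test
train_rows, test_rows = rows[:split], rows[split:]

results = {}
for name, featurize in [("one-for-all", one_for_all), ("segmented", segmented)]:
    X = [featurize(x, dom) for x, dom, _ in train_rows]
    y = [label for _, _, label in train_rows]
    results[name] = accuracy_by_domain(train(X, y), test_rows, featurize)
print(results)
```

The synthetic labels are deliberately extreme, so the gap between the two models is far larger than the differences reported in this paper; the sketch only shows the mechanics of training on augmented features and scoring each domain separately.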

5.1 Two vs. All Model

As explained in Section 4, in this two-vs-all setting the two most covered domains are compared to the rest of the domains. We compare our model with the two base models: the one-for-all and independent models. The most covered job functions in the data are Sales (about 12% coverage) and Operations (about 8% coverage). The accuracy results for the job function meta-data feature are shown in Table 1. The Sales and Operations row represents the domain with users in both the Sales and Operations domains, and the Other row shows the users who are in neither the Sales nor the Operations domain. The first two rows show the users who are only in the Sales or only in the Operations domain, respectively.

Table 1: Accuracy of two-vs-all vs. one-for-all and independent models for job functions

Domain                 One-for-All   Two-vs-All (Segmented)   Independent
Sales                  96.28%        96.33%                   95.01%
Operations             96.43%        96.49%                   94.93%
Sales and Operations   96.54%        96.58%                   NA
Other                  96.44%        96.45%                   96.44%

As we see in Table 1, the segmented model has slightly higher accuracy than the base models. The difference is bigger for the independent base model, especially in the two most-covered domains. To understand how the models work differently, we look at the coefficients assigned to the base variables of the two-vs-all segmented model and the one-for-all base model in Figure 5. In this picture, we can compare the coefficient values for these two models. The two-vs-all segmented model can have more than one coefficient for each variable: the variable might repeat in the main-effect part of the model or in each of the domain-specific parts of the model. To be able to compare the coefficient values, we use the average coefficient value of the main-effect and domain-specific parts of the two-vs-all segmented model. The red dots represent this average value and the bars around them represent the variance of these coefficients in the model. The blue dots are the coefficient values in the one-for-all base model. As we can see in the picture, some of the coefficients have a different value in the two models.
For example, the similarity of user skills with the job description has more importance in the two-vs-all segmented model. In addition, we can see that some of the base variables existing in the two-vs-all segmented model do not exist in the one-for-all base model. The reason is that these variables were removed automatically during the regularization process. For example, the similarity between the user's location and the location of the job is only present in the two-vs-all segmented model. Based on Figure 5, the two-vs-all model's coefficients have different variance in the main-effect and domain-specific models. Some of the coefficients vary more than the others. This can indicate that this model is able to capture the difference between different domains.

Figure 5: Coefficient Values for Two-vs-All Segmented Model (Red) Compared to the One-for-All Base Model (Blue)

Table 2: Accuracy of domain-clusters segmented model vs. one-for-all model for job functions (the One-for-All values are not recoverable from the source)

Domain      One-for-All   Domain Clusters (Segmented)
Cluster 1   -             96.52%
Cluster 2   -             96.26%
Cluster 3   -             96.75%
Cluster 4   -             97.09%
Cluster 5   -             97.61%
Cluster 6   -             96.85%
Cluster 7   -             96.61%
Cluster 8   -             96.36%

To understand whether the two-vs-all segmented model is capturing the difference between each of the domains, we look at the coefficients of variables in all four job function domains (Sales, Operations, Sales and Operations, and Other) in Figure 6. The coefficient values are shown stacked over each other in the picture. Since there are many base variables in the model and their names are not easily readable, we removed the names in the figures of this section. As we can see, the coefficient values for some of the variables are different for various domains. With a closer look we can find the differences in coefficient values. For example, the similarity of the user's past positions to the job description is more important for the Sales domain than for the Operations domain. The similarity of the user's skills to the job's required skills is more important for users in the Operations domain than in the Sales domain.
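The coefficient comparison used in this section, where each variable's coefficients are averaged over the main-effect and domain-specific parts and their variance is reported, can be sketched as follows. The variable names and coefficient values are made up for illustration, and two choices are assumptions not stated in the paper: population variance is used, and a variable removed by regularization from some part is skipped rather than counted as zero.

```python
from statistics import mean, pvariance


def summarize_coefficients(parts):
    """Return {variable: (mean, variance)} across the parts of a segmented model.

    `parts` maps a part name (main effect or a domain) to a dict of
    {variable name: coefficient}. Variables missing from a part are skipped.
    """
    variables = sorted({name for coefs in parts.values() for name in coefs})
    summary = {}
    for name in variables:
        values = [coefs[name] for coefs in parts.values() if name in coefs]
        summary[name] = (mean(values), pvariance(values))
    return summary


# Illustrative coefficients only; these are not values from the paper.
parts = {
    "main-effect": {"skills_similarity": 0.4},
    "sales": {"skills_similarity": 0.2, "location_similarity": 0.1},
    "operations": {"skills_similarity": 0.6},
}
summary = summarize_coefficients(parts)
for name, (avg, var) in summary.items():
    print(f"{name}: mean={avg:.3f} variance={var:.4f}")
```

A variable with a large variance across parts is one whose importance differs between domains, which is the pattern the figures in this section are read for.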
5.2 Domain Clusters Model

As explained in Section 4, the values of the user job function meta-data feature are grouped into 8 clusters. The accuracy results for the clusters in the job function meta-data feature are shown in Table 2. As we can see in this table, the accuracies of the models are very close to each other: for some clusters the baseline model works better than the domain-clusters segmented model and for others it is the reverse. We compare the coefficient values of the one-for-all baseline and the domain-clusters model to gain a more detailed insight into the results. Looking at the differences in coefficient values for each of the domains in the segmented model, we can understand how different features are more important for each of the domains. Figure 7 shows the coefficient values of the domain-clusters model for the job function meta-data feature. As we can see in the picture, some of the coefficients are more important in some of the domains and less in others. For example, the similarity of the user's previous searches to the job description is more important to users in cluster 2 (including sales, marketing, and similar job functions).

Figure 6: Coefficient Values for Different Domains in Two-vs-All Segmented Model

Figure 7: Coefficient Values for Domain Clusters Segmented Model for Job Functions

Figure 8: Accuracy of All-Indicators Segmented Model for Job Functions

5.3 All Indicators Model

In this model, we pick all of the possible values for a meta-data feature as domain indicators: each indicator feature is representative of one of the values the meta-data feature can take. Since we choose job function as our meta-data feature, we will end up with 26 domains related to job functions, such as IT, sales, engineering, real estate, and marketing. We can see the accuracy results of the all-indicators segmented model and the one-for-all and independent baseline models in Figure 8. As we can see here, the all-indicators model is usually slightly better than the two other models. In some of the domains (such as the Product Management (number 19) and Real Estate (number 23) domains) the one-for-all model has higher accuracy than the all-indicators model. Comparing the coefficients of the one-for-all and all-indicators models in Figure 9, we can see that some of the base features have very different coefficients in the two models. Also, some of the base features have a large variance across the different domains of the all-indicators model. There are some base features that are removed from the one-for-all model by regularization, while they still play a role in the all-indicators model.

Figure 9: Coefficient Values for All-Indicators Segmented Model (Red) Compared to the One-for-All Base Model (Blue)

6. CONCLUSION AND FUTURE WORK

In this paper, we propose a framework for content-based cross-domain recommender systems. This framework is flexible enough to be implemented with various classifiers. The model in this framework can transfer common information among different domains while keeping them distinct. We define user-based domains based on users' meta-data features and implement our framework using logistic regression. The regularization in the model allows us to pick the important features of each of the domains automatically, while keeping it flexible enough to accept expert knowledge in choosing the features. We experiment on job recommendations for LinkedIn users. Our results indicate a slight improvement in recommendation accuracy in the offline setting. Furthermore, the experimental results are promising: i) different features have different coefficient values in each of the domains; and ii) coefficients are different in the cross-domain model compared to the one-for-all base model. As a result, we are hopeful that this model can be a good fit to our problem in the online experiments (A/B testing). We expect several directions for future work: implementing the framework based on various classifier algorithms, expanding the experiments on the model using different meta-data features, and experimenting on the interaction of various possible domains. Automatic selection of meta-data features is another interesting direction of research.

7. REFERENCES

[1] S. Berkovsky, T. Kuflik, and F. Ricci. Mediation of user models for enhanced personalization in recommender systems. User Modeling and User-Adapted Interaction, 18(3), 2008.
[2] J. H. Clark, A. Lavie, and C. Dyer. One system, many domains: Open-domain statistical machine translation via feature augmentation. In Proceedings of the Tenth Biennial Conference of the Association for Machine Translation in the Americas, 2012.
[3] H. Daumé III. Frustratingly easy domain adaptation. In ACL, volume 1785, page 1787.
[4] L. Duan, D. Xu, and I. Tsang. Learning with augmented features for heterogeneous domain adaptation. arXiv preprint.
[5] M. Joshi, M. Dredze, W. W. Cohen, and C. P. Rosé. What's in a domain? Multi-domain learning for multi-attribute data. In Proceedings of NAACL-HLT.
[6] M. Kaminskas and F. Ricci. Location-adapted music recommendation using tags. In User Modeling, Adaption and Personalization. Springer.
[7] T. Keim. Extending the applicability of recommender systems: A multilayer framework for matching human resources. In System Sciences, HICSS, Annual Hawaii International Conference on.
[8] D. Lee and P. Brusilovsky. Fighting information overflow with personalized comprehensive information access: A proactive job recommender. In Autonomic and Autonomous Systems, ICAS07, Third International Conference on, pages 21-21.
[9] D. Lee and P. Brusilovsky. Proactive: Comprehensive access to job information. Journal of Information Processing Systems, 8(4), December.
[10] B. Li. Cross-domain collaborative filtering: A brief survey. In Tools with Artificial Intelligence (ICTAI), IEEE International Conference on. IEEE.
[11] H. M. Enhancing a Job Recommender with Implicit User Feedback. PhD thesis, Fakultät für Informatik der Technischen Universität Wien.
[12] R. Rafter, K. Bradley, and B. Smyth. Personalized retrieval for online recruitment services. In Proceedings of the 22nd Annual Colloquium on Information Retrieval.
[13] H. Ritzema. Frequency and regression analysis (chapter 6). Drainage principles and applications, 16.
[14] S. Sahebi and P. Brusilovsky. Cross-domain collaborative recommendation in a cold-start context: The impact of user profile size on the quality of recommendation. In User Modeling, Adaptation, and Personalization. Springer.