MobiDev: Choosing the Best Way to Build an NLP-Based Text Recommendation System (English edition) (14 pages).pdf


Table of Contents

How do NLP-based Recommendations Work?
Choosing One of 5 NLP-Based Recommendation Approaches
1. Text Similarity
2. Named Entity Recognition
3. Topic Extraction
4. Keyword Extraction
5. Text Summarization
Where to Apply Recommendations Using NLP

Recommendation systems built with machine learning can solve one of the most tedious tasks of gathering customer data, which is to study their preferences and suggest relevant information in the future. Alongside search, recommendations (also called discovery) provide customers with an endless stream of information that is relevant to their search history and preferences, and generally help them find what they want much faster.

Machine learning recommendations are based on keywords, user activity, and other similar measures that help us define what a person may like. But they become ineffective if the user preference involves thousands of filters and subjective criteria that are specific to the user. So here, we discuss Natural Language Processing recommendation systems. Based on our previous experience, we'll outline the general idea and limitations of this model, and explain some of the best approaches for how to build an NLP-based recommender system.

How do NLP-based Recommendations Work?

Natural Language Processing, or NLP, is good at handling plain text and colloquial speech. You can find tons of sentiment analysis or document processing cases that rely on NLP to solve the task of working with written language. These capabilities can be applied to recommendations as well, if we understand our inputs and outputs right.

Any recommendation system performs a basic function for a user. It matches user expectations with the discovered content, no matter if it was an intended request or not. Recommendations are formed by learning the previous activity of the user, and assigning categories to the content pieces that we call “filters”.

AI models take numeric data as input. When it comes to data such as the age of a user or years of experience, AI models can easily operate with these values. To work with this type of data, it is necessary to follow the steps of the CRISP-DM methodology: analyze the data, understand how best to transform it into a training format, and perform the conversion. Think, for example, of a professional network like LinkedIn where a user applies filters to search for a career opportunity.

[Figure: Basic recommendations using content-based filtering for job search]

But how can we handle text data? How can we convert words into numeric data? Descriptions, commentaries, and colloquial speech messages can specify data points, like years of experience with a specific technology, that may be crucial for correct matching.

NLP methods can help the user provide search input in free form, without any restrictions, in line with the system's requirements, and get an answer to the desired request. Text information can store a large amount of data that cannot be covered by ordinary filters, as in the example above.

[Figure: NLP-based recommendations for job search matching]

In this simple example, we can see that extracting data from text can instantly improve matching results, and save time both for the user and for the company that searches for a professional. So now, let's discuss what approaches we can apply to work with text data and build an NLP-based recommender system.

Choosing One of 5 NLP-Based Recommendation Approaches

Let's consider this task using the Data Science and AI Jobs Indeed dataset, which contains information about various job posts in the field of AI. In the context of this dataset, the matching task can be formulated as follows: to find a suitable vacancy for a candidate, based on the skills and experience they describe. So as an input, we'll get tons of written descriptions that convey information about the skillset of our potential employees.

Our dataset contains the following textual data:

Title of the vacancy
Location
A detailed description of the requirements for the candidate

[Figure: Dataset information example]

Given that the dataset was scraped from a real professional network, we can understand which NLP methods will work best in a real-world environment.

1. Text Similarity

The text similarity approach provides the coefficient of similarity of two texts by comparing the vectors of both texts. In this case, the feature that will be used when building the AI model is the value of this coefficient. Below is an example of a text similarity model at work, where the similarity score is the coefficient of similarity of the job description with the description of the candidate's skills.

[Figure: Text similarity model output example]

Text similarity models can be used on their own, because they provide more accurate results and indicate whether two texts explain the same thing based on semantic analysis. Additionally, text similarity can also be used in conjunction with other NLP models like text summarization.

2. Named Entity Recognition

Another NLP approach is called Named Entity Recognition, or NER. The essence of this method is to find named entities in the text, such as location, organization, person, and so on. As long as we target certain keywords, we want to be sure the model indicates named entities accurately, so that the recommendation system doesn't mistake organization names for locations, or technologies for projects.

Below is an example of using the NER model on the location entity. The feature that will be used for modeling will be the coefficient of the similarity of two entities, as shown below.

[Figure: Named Entity Recognition results]

This method can be useful in replacing standard filters, such as choosing a city or choosing a previous place of work, allowing the user to provide search queries in free form.
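The entity-similarity feature described above could be sketched roughly as follows. This is a minimal illustration, assuming the location entities have already been extracted by some NER model (not shown here). `normalize` and `entity_similarity` are hypothetical helper names, and the character-level ratio from Python's standard `difflib` module is only a simple stand-in for whatever similarity measure a production system would use.

```python
import re
from difflib import SequenceMatcher


def normalize(entity: str) -> str:
    """Lower-case an entity string and collapse punctuation and whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", entity.lower())
    return " ".join(cleaned.split())


def entity_similarity(a: str, b: str) -> float:
    """Return a similarity coefficient in [0, 1] for two entity strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Hypothetical entities: one taken from a candidate's free-form query,
# one taken from the location field of a job post.
candidate_location = "New York, NY"
vacancy_location = "new york"
feature = entity_similarity(candidate_location, vacancy_location)
print(f"location similarity: {feature:.2f}")
```

The resulting coefficient can then be passed to the matching model as one numeric feature, alongside features produced by the other approaches.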

3. Topic Extraction

The topic extraction method helps to identify implicit subgroups in the training data (in our case, in the list of vacancies), into which the input data can then be classified. For example, for the job titles in our dataset, the topic extraction model (BERTopic) helped to identify several topics, each of which is described by its 5 most common words.

[Figure: Most common topics of the Data Science and AI Jobs Indeed dataset]

In order to use this information when building a matching model, you can designate a feature that indicates how similar two texts are in terms of the topics found in them, as shown in the figure below.

4. Keyword Extraction

The keyword extraction method helps to check for the presence of the most important keywords in the input text, as shown below. Automatic keyword extraction is useful for parsing the text and denoting which parts of a sentence, or which separate words, are the key ones. Further, we can count how many of these keywords occur in a target text, like the one shown in the example below.

[Figure: Keyword extraction model output]

5. Text Summarization

In addition to all of the above-mentioned methods, you can also consider using text summarization to work with large texts. Large bodies of descriptions often appear on content platforms that want to list as many product features and as much information as possible. While that is not the case for professional networks, summarization helps to produce shorter versions of the text while preserving key information points.
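As a rough illustration of the summarization idea, the sketch below builds an extractive summary by scoring sentences on word frequency and keeping the top-scoring ones in their original order. It assumes a naive regex sentence splitter and a tiny hand-picked stop-word list. `summarize` is a hypothetical helper, the sample job description is invented, and this frequency heuristic is a far simpler technique than the summarization models a real pipeline would likely use.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are",
              "for", "with", "we", "you", "will"}


def tokenize(text: str) -> list[str]:
    """Split text into lower-case words, dropping stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarize(text: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, preserving their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence: str) -> float:
        tokens = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


# Invented sample text standing in for a long job description.
job_description = (
    "We are hiring a machine learning engineer. "
    "The engineer will build machine learning models for text data. "
    "Our office has a nice kitchen. "
    "Experience with Python and machine learning is required."
)
summary = summarize(job_description, max_sentences=2)
print(summary)
```

The shortened text can then be fed to the other methods, for example to a text similarity model, in place of the full description.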

All of the above-mentioned methods can, or rather should, be combined into a single NLP pipeline. The complexity of NLP processing will depend on the domain area, goals, and features of the existing system. However, the best matching results can be provided to the user once a recommender system is capable of working with different types of content, different text lengths, and so on.

After converting the source text into a set of numeric features, we have to build an AI model. Here we can use three types of models:

Classification model that predicts two possible values: match/no match (1/0). Based on the probability of the model prediction, you can choose the most appropriate matching option.
Regression model that predicts the coefficient of similarity of two candidates.
Graph-based solution, where the features obtained from the text are used as additional graph vertices, and matching is performed by the number of matching vertices, as shown below.

Where to Apply Recommendations Using NLP

Although NLP can be superior to standard search capabilities because it allows the user to type their request in free form, that doesn't mean we can't combine it with conventional recommendation systems. NLP serves a specific purpose here, and does require setting up an infrastructure for data collection, a server for model operation, and other elements.

So this approach fits perfectly if you already have data collected, or if there is a relatively small amount of content that you want to recommend this way. This requires solid data science expertise to analyze the requirements of your existing infrastructure, prepare data, and implement NLP methods.

As an alternative, we can apply GPT models, which have become popular lately. ChatGPT and GPT-3 are conversational bot models that can be integrated via API and won't require a huge data collection hassle. This is because they can already recognize human speech in context and provide recommendations without applying any filters.

Our AI engineers are always looking for challenging tasks and projects that require using advanced machine learning techniques. So feel free to discuss your project vision with us.

