Identify the correct CSS selector of a URL for an R script


I am trying to obtain data from a website with the help of the following script:

require(httr)
require(rvest)

# Submit the advanced search form as a POST request
res <- httr::POST(url = "http://apps.kew.org/wcsp/advsearch.do",
                  body = list(page = "advancedsearch",
                              attachmentexist = "",
                              family = "",
                              placeofpub = "",
                              genus = "arctodupontia",
                              yearpublished = "",
                              species = "scleroclada",
                              author = "",
                              infrarank = "",
                              infraepithet = "",
                              selectedlevel = "cont"),
                  encode = "form")

# Parse the response and try to extract the link
pg <- content(res, as = "parsed")
lnks <- html_attr(html_node(pg, "td"), "href")

However, in some cases, like the example above, it does not retrieve the right link because, for some reason, html_attr does not find URLs ("href") within the node detected by html_node. So far, I have tried different CSS selectors, like "td", "a.onwardnav" and ".plantname", but none of them generates an object that html_attr can handle correctly. Any hint?

You are close to getting the answer you were expecting. If you would like to pull the links off of the desired page, then:

lnks <- html_attr(html_nodes(pg,"a"), "href")  

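will return a character vector with the "href" attribute of every "a" tag on the page. Notice that the command is html_nodes and not html_node. There are multiple "a" tags, hence the plural; html_node would return only the first match.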
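
To illustrate the difference, here is a minimal sketch on a small inline HTML fragment. The markup and the href values are made up for the example (only the "onwardnav" class name comes from the question), so the real page may well be structured differently.

```r
library(rvest)

# Hypothetical fragment: the markup and href values are invented for
# illustration and are not taken from the site in the question.
html <- '<table>
  <tr><td><a class="onwardnav" href="detail?id=1">Name one</a></td></tr>
  <tr><td><a class="onwardnav" href="detail?id=2">Name two</a></td></tr>
</table>'
pg_demo <- read_html(html)

# A "td" element carries no href attribute itself, so this gives NA
td_href <- html_attr(html_node(pg_demo, "td"), "href")

# html_node() keeps only the first matching "a" tag
first_href <- html_attr(html_node(pg_demo, "a"), "href")

# html_nodes() keeps every matching "a" tag
all_hrefs <- html_attr(html_nodes(pg_demo, "a"), "href")
```

In this sketch the "href" lives on the "a" tag nested inside the "td", which would explain why selecting "td" alone gives html_attr nothing to work with.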
If you are looking for the information from the table in the body of the page, try this:

html_table(pg, fill = TRUE)
# or
html_nodes(pg, "tr")

The second line will return a list of the 9 rows from the table, which one can then parse to obtain the row names ("th") and/or row values ("td").
Hope this helps.
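
Parsing the rows could look roughly like the sketch below. It assumes each row holds one "th" and one "td"; the fragment and its placeholder labels are invented for the example and do not reproduce the actual table.

```r
library(rvest)

# Hypothetical two-row table with placeholder labels and values
html <- '<table>
  <tr><th>Field A</th><td>value A</td></tr>
  <tr><th>Field B</th><td>value B</td></tr>
</table>'
tbl_demo <- read_html(html)

# One node per table row
rows <- html_nodes(tbl_demo, "tr")

# For each row, take the first "th" as the name and the first "td" as the value
row_names  <- html_text(html_node(rows, "th"))
row_values <- html_text(html_node(rows, "td"))

# Combine into a named character vector
result <- setNames(row_values, row_names)
```

Calling html_node on the set of rows returns one match per row, so the names and values stay aligned even when there are several rows.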

