
This set of functions allows you to simulate a user interacting with a website, using forms and navigating from page to page.

  • Create a session with session(url).

  • Navigate to a specified url with session_jump_to(), or follow a link on the page with session_follow_link().

  • Submit an html_form with session_submit().

  • View the history with session_history() and navigate back and forward with session_back() and session_forward().

  • Extract page contents with html_element() and html_elements(), or get the complete HTML document with read_html().

  • Inspect the HTTP response with httr::cookies(), httr::headers(), and httr::status_code().
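
The workflow in the list above can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: it assumes network access, reuses the http://hadley.nz URL from the Examples section, and picks the "title" selector purely for demonstration.

```r
library(rvest)

# Open a session (requires network access).
s <- session("http://hadley.nz")

# Confirm that the object is a session.
is.session(s)

# Extract content from the current page; "title" is an illustrative selector.
s %>%
  html_element("title") %>%
  html_text2()

# Inspect the most recent HTTP response.
httr::status_code(s)
names(httr::headers(s))
httr::cookies(s)
```

The httr helpers operate on the response stored for the current page, so calling them again after navigating reports on the new page.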

Usage

session(url, ...)

is.session(x)

session_jump_to(x, url, ...)

session_follow_link(x, i, css, xpath, ...)

session_back(x)

session_forward(x)

session_history(x)

session_submit(x, form, submit = NULL, ...)

Arguments

url

A URL, either relative or absolute, to navigate to.

...

Any additional httr config to use throughout the session.

x

A session.

i

An integer to select the ith link or a string to match the first link containing that text (case sensitive).

css, xpath

Elements to select. Supply one of css or xpath depending on whether you want to use a CSS selector or XPath 1.0 expression.

form

An html_form to submit.

submit

Which button should be used to submit the form?

  • NULL, the default, uses the first button.

  • A string selects a button by its name.

  • A number selects a button using its relative position.
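
To illustrate how form and submit fit together, here is a hedged sketch of a form submission. It assumes network access, and it assumes that the first form on http://www.google.com is a search form with a field named "q"; the button name "btnI" in the commented line is likewise an assumption. Adjust these to the page you are working with.

```r
library(rvest)

# Open a session on a page that contains a form (requires network access).
s <- session("http://www.google.com")

# Assumption: the first form on the page is the search form.
search <- html_form(s)[[1]]

# Assumption: the form has a text field named "q".
search <- html_form_set(search, q = "rvest")

# submit = NULL (the default) uses the first button.
result <- session_submit(s, search)

# A string would select a button by name instead ("btnI" is an assumed name):
# result <- session_submit(s, search, submit = "btnI")

# Check the response returned by the submission.
httr::status_code(result)
```

Passing the form through html_form_set() before submitting keeps the original form object unchanged, so the same form can be reused with different values.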

Examples

s <- session("http://hadley.nz")
s %>%
  session_jump_to("hadley-wickham.jpg") %>%
  session_jump_to("/") %>%
  session_history()
#> Warning: Not Found (HTTP 404).
#>   https://hadley.nz/
#>   https://hadley.nz/hadley-wickham.jpg
#> - https://hadley.nz/

s %>%
  session_jump_to("hadley-wickham.jpg") %>%
  session_back() %>%
  session_history()
#> Warning: Not Found (HTTP 404).
#> - https://hadley.nz/
#>   https://hadley.nz/hadley-wickham.jpg

# \donttest{
s %>%
  session_follow_link(css = "p a") %>%
  html_elements("p")
#> Navigating to <http://rstudio.com>.
#> {xml_nodeset (16)}
#>  [1] <p class="h5">See you in Seattle August 12-14!</p>
#>  [2] <p>Securely share data-science applications<br>\n across your team ...
#>  [3] <p>Our code is your code. Build on it. Share it. Improve people’s  ...
#>  [4] <p>Take the time and effort out of uploading, storing, accessing,  ...
#>  [5] <p class="sh4 uppercase mb-[8px] text-blue1">\n            Custome ...
#>  [6] <p class="mt-[8px] body-md-regular text-blue1/[.62]">\n            ...
#>  [7] <p class="mt-[16px] body-md-regular text-neutral-blue62 line-clamp ...
#>  [8] <p class="mt-[16px] body-md-regular text-neutral-blue62 line-clamp ...
#>  [9] <p class="description body-lg-regular text-neutral-light/70" style ...
#> [10] <p class="body-sm-regular text-blue1/[.62] mt-[25px]">\n           ...
#> [11] <p class="ui-small uppercase text-blue1">\n                        ...
#> [12] <p class="ui-small uppercase text-blue1">\n                        ...
#> [13] <p class="ui-small uppercase text-blue1">\n                        ...
#> [14] <p class="ui-small uppercase text-blue1">\n                        ...
#> [15] <p class="ui-small uppercase text-blue1">\n                    con ...
#> [16] <p class="body-md-regular body-sm-regular">We use cookies to bring ...
# }
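
The examples above select a link with css; the sketch below shows the xpath alternative together with a back-and-forward round trip. It assumes network access and that the page still contains a link inside a paragraph; the expression "//p//a" is an illustrative XPath counterpart to the "p a" selector, not something taken from the package documentation.

```r
library(rvest)

s <- session("http://hadley.nz")

# Follow the first link found inside a paragraph, selected with XPath.
linked <- session_follow_link(s, xpath = "//p//a")

# Step back to the starting page, then forward to the linked page again.
returned <- session_back(linked)
again <- session_forward(returned)

# The history marks the current position within the visited URLs.
session_history(again)
```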