How it works: Selenium
I've done quite a bit of work with Selenium on and off, mainly for browser automation tests. There are many high-quality official libraries for Selenium covering a range of languages. It's an amazing technology and a real joy to work with; however, things can go wrong, and it's useful to know what's going on under the hood in order to work out where the problem lies.
For this reason I present my very-probably-wrong guide to Selenium.
Gecko, drivers!? What's going on?
There are three components involved when your code interacts with Selenium (ignoring RemoteWebDriver use cases):
- Your code
- A driver
- The browser
For each browser that supports Selenium, the browser vendor provides a "driver". This is usually a small standalone executable (for example, chromedriver.exe on Windows) tailored for a specific browser:
- Chrome / Chromium
- Firefox
- Edge
- Other drivers available
This driver is the part your code actually interacts with. The driver then communicates with the browser using whatever magic the vendors design. So that you can change browsers while using the same code, every driver exposes the same interfaces/endpoints, as defined by the WebDriver specification.
Put simply, this means each driver is a small HTTP server exposing a simple RESTful API.
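To make that a bit more concrete, here's a rough Python sketch of what "talking to a driver" can look like with nothing but the standard library. It is not how the official libraries are implemented. The assumptions: a driver is already listening on http://localhost:9515 (ChromeDriver's default port), the helper names (build_request, new_session_request and friends) are made up for illustration, and the new-session payload sends both the newer "capabilities" key and the older "desiredCapabilities" key because which one a driver reads depends on its version.

```python
import json
import urllib.request

# Assumption: a driver is already running and listening here.
# 9515 is ChromeDriver's default port; other drivers use other ports.
DRIVER_URL = "http://localhost:9515"


def build_request(method, path, body=None, base_url=DRIVER_URL):
    """Build (but do not send) an HTTP request for a driver endpoint."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(base_url + path, data=data, method=method)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def new_session_request():
    """POST /session asks the driver to start a browser."""
    body = {
        # Newer (W3C WebDriver) shape.
        "capabilities": {"alwaysMatch": {"browserName": "chrome"}},
        # Older (JSON Wire Protocol) shape.
        "desiredCapabilities": {"browserName": "chrome"},
    }
    return build_request("POST", "/session", body)


def navigate_request(session_id, url):
    """POST /session/{id}/url tells the browser to load a page."""
    return build_request("POST", f"/session/{session_id}/url", {"url": url})


def delete_session_request(session_id):
    """DELETE /session/{id} closes the browser and ends the session."""
    return build_request("DELETE", f"/session/{session_id}")


def send(request):
    """Send a request to the driver and decode its JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def demo():
    """Drive a real browser. Only works with a driver running at DRIVER_URL."""
    reply = send(new_session_request())
    # Older drivers put sessionId at the top level, newer ones under "value".
    session_id = reply.get("sessionId") or reply["value"]["sessionId"]
    try:
        send(navigate_request(session_id, "https://example.com"))
    finally:
        send(delete_session_request(session_id))
```

Nothing is sent until you call demo() with a driver running. The point is that every "Selenium command" boils down to an HTTP method, a path and a small JSON body, which is handy to remember when you're debugging and want to see exactly what the driver was asked to do.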
Demo
To demonstrate this I'll use ChromeDriver version 2.29, which is the latest version at the time of writing. A driver can only talk to the versions of the browser it is compatible with. In this demo I'm running Chrome 58.