Overview:
In this article, we will see how to use Selenium WebDriver to find broken links on a webpage.
Utility Class:
To demonstrate how this works, I will be using the simple utility below. The method simply returns the HTTP response code for a given URL.
import java.net.HttpURLConnection;
import java.net.URL;

public class LinkUtil {

    // hits the given URL and returns the HTTP response code;
    // returns 0 if the request fails (e.g. unknown host or malformed URL)
    public static int getResponseCode(String link) {
        HttpURLConnection con = null;
        int responseCode = 0;
        try {
            URL url = new URL(link);
            con = (HttpURLConnection) url.openConnection();
            responseCode = con.getResponseCode();
        } catch (Exception e) {
            // skip - failed requests are reported with code 0
        } finally {
            if (con != null) {
                con.disconnect();
            }
        }
        return responseCode;
    }
}
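Before wiring it into Selenium, a quick standalone check helps confirm the utility behaves as expected. This is just a sketch; the URLs are placeholders and the exact codes you get back depend on the target server.

// hypothetical sanity check - actual codes depend on the server
System.out.println(LinkUtil.getResponseCode("https://www.yahoo.com"));               // e.g. 200
System.out.println(LinkUtil.getResponseCode("https://www.yahoo.com/no-such-page"));  // e.g. 404
System.out.println(LinkUtil.getResponseCode("https://no-such-host.invalid"));        // 0 (connection failed)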
Usage:
The rest is simple: find all elements that have an href (or src) attribute and, using the utility class above, collect the response codes for all the links and group them by response code.
driver.get("https://www.yahoo.com");
Map<Integer, List<String>> map = driver.findElements(By.xpath("//*[@href]"))
.stream() // find all elements which has href attribute & process one by one
.map(ele -> ele.getAttribute("href")) // get the value of href
.map(String::trim) // trim the text
.distinct() // there could be duplicate links , so find unique
.collect(Collectors.groupingBy(LinkUtil::getResponseCode)); // group the links based on the response code
Now we can access the URLs based on the response code we are interested in.
map.get(200) // all the good URLs
map.get(403) // all the 'Forbidden' URLs
map.get(404) // all the 'Not Found' URLs
map.get(0)   // all the URLs that could not be reached (unknown host, etc.)
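Elements referenced via the src attribute (images, scripts, iframes) can be checked the same way by switching the locator and the attribute. Here is a minimal sketch, assuming the same driver and LinkUtil as above; it is not verbatim from the steps so far, just the same pipeline applied to src.

Map<Integer, List<String>> srcMap = driver.findElements(By.xpath("//*[@src]"))
        .stream()                             // process each element that has a src attribute
        .map(ele -> ele.getAttribute("src"))  // get the value of src
        .map(String::trim)                    // trim the text
        .distinct()                           // remove duplicate resources
        .collect(Collectors.groupingBy(LinkUtil::getResponseCode)); // group by response code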
We can simplify this even further: just partition the URLs by whether the response code is 200 or not.
Map<Boolean, List<String>> map = driver.findElements(By.xpath("//*[@href]"))
        .stream()                              // process each element that has an href attribute
        .map(ele -> ele.getAttribute("href"))  // get the value of href
        .map(String::trim)                     // trim the text
        .distinct()                            // remove duplicate links
        .collect(Collectors.partitioningBy(link -> LinkUtil.getResponseCode(link) == 200)); // partition by whether the response code is 200
Now we can simply access the map to list all the good or bad URLs, as shown here.
map.get(true)  // all the good URLs
map.get(false) // all the bad URLs
Print all the bad URLs.
map.get(false)
   .forEach(System.out::println);
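If this runs as part of an automated test, the partitioned map also makes the assertion straightforward. A minimal sketch, assuming JUnit 5 is on the classpath (the assertion itself is an addition for illustration, not part of the flow above):

// assumes JUnit 5 (org.junit.jupiter.api.Assertions)
List<String> badUrls = map.get(false);
Assertions.assertTrue(badUrls.isEmpty(), "Broken links found: " + badUrls);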
Happy Testing & Subscribe 🙂