JSoup A Nice Initiative To Fetch Information From HTML

JSoup To Fetch Information From HTML:

For the last few days I was trying to get data from an HTML page that is very dynamic in nature.
In other words, I needed a good HTML parser to read HTML files.

I tried a few readily available solutions, and I am happy to say that among them JSoup is quite satisfactory.

Here is what I like about it:

  • It is a Java library that can easily be added to a project in the leading Java IDEs (I tried JDeveloper and Eclipse).
  • It can be used to fetch and manipulate HTML data.
  • It is very useful for report analysis.
  • It can find the exact data you need in a few easy steps.
  • Very little code is required.
  • It is useful for both structured and unstructured HTML.
  • It is open source, and the code is available on GitHub.
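
As a minimal sketch of the "fetch and manipulate" point above, the snippet below parses an HTML string held in memory, reads a value, and then changes it. The class name JsoupManipulateSketch, the HTML fragment, and the id "status" are made up for illustration; only the JSoup calls (Jsoup.parse, select, first, text, attr, outerHtml) come from the library.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupManipulateSketch {

    // Hypothetical HTML fragment, used only for this illustration
    static final String HTML =
        "<html><body><p id=\"status\">Build failed</p></body></html>";

    // Reads the text of the paragraph with id "status", or null if it is missing
    static String readStatus(Document doc) {
        Element status = doc.select("p#status").first();
        return status == null ? null : status.text();
    }

    // Replaces the text of that paragraph and sets its class attribute to "ok"
    static void markPassed(Document doc) {
        Element status = doc.select("p#status").first();
        if (status != null) {
            status.text("Build passed");
            status.attr("class", "ok");
        }
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(readStatus(doc));
        markPassed(doc);
        // Print the modified element as HTML
        System.out.println(doc.select("p#status").first().outerHtml());
    }
}
```

The same select call accepts CSS-style selectors, so reading and editing use one lookup mechanism.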

The documentation is available here.
The source code is available here.

Very nice examples and discussion can be found in the links below.

The jar can be downloaded from here.

How To Read Data From HTML Via JSoup In Java?

I have a requirement where a URL is provided to me, and the page behind it contains multiple tables. Table 1 holds the summary report and table 2 holds the detailed report.

My objective is to get the data from the first table.
I found the table's class by looking at the page source. The selector is table[class=details].
This table has one header row and one row of data. The header row gives me the column names.
As I am processing a test result, it contains total tests, pass, fail, time to execute, and similar information.
Let's see how to fetch this information…

import java.io.IOException;
import java.net.URL;
import java.util.Iterator;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ReadHTML {

    // Fetches the page at the given URL and prints each header/value pair of the table
    public void getValues(String url) throws IOException {
        URL getUrl = new URL(url);
        // Fetch and parse the page; give up after 3000 ms
        Document doc = Jsoup.parse(getUrl, 3000);

        // I want the first table with the "details" class
        Element table = doc.select("table[class=details]").first();
        if (table == null) {
            throw new IllegalStateException("No table with class 'details' found at " + url);
        }

        // Fetch the header values
        Iterator<Element> iteh = table.select("th").iterator();
        String test = iteh.next().text();
        String fail = iteh.next().text();
        String err = iteh.next().text();
        String knownIss = iteh.next().text();
        String pass = iteh.next().text();
        String skip = iteh.next().text();
        String suc_rate = iteh.next().text();
        String time = iteh.next().text();

        // Fetch the row values
        Iterator<Element> ite = table.select("td").iterator();
        String testV = ite.next().text();
        String failV = ite.next().text();
        String errorV = ite.next().text();
        String knownIssueV = ite.next().text();
        String passV = ite.next().text();
        String skipV = ite.next().text();
        // Keep only the text between ':' and '%' in the success rate cell
        String sucv = ite.next().text().split(":")[1].split("%")[0].trim();
        String timeV = ite.next().text();

        System.out.println("Value of: " + test + " is " + testV);
        System.out.println("Value of: " + fail + " is " + failV);
        System.out.println("Value of: " + err + " is " + errorV);
        System.out.println("Value of: " + knownIss + " is " + knownIssueV);
        System.out.println("Value of: " + pass + " is " + passV);
        System.out.println("Value of: " + skip + " is " + skipV);
        System.out.println("Value of: " + suc_rate + " is " + sucv);
        System.out.println("Value of: " + time + " is " + timeV);
    }
}
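
The code above calls next() once per column, so it depends on the table always having exactly the same eight columns. As a variation (a sketch only, not the approach used above), the headers and cells can be paired in a loop. To keep it runnable without a live URL, the example parses a made-up HTML string; the class name TableToMap, the method readFirstRow, and the sample table are all assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class TableToMap {

    // Made-up sample resembling a test summary table
    static final String SAMPLE =
        "<table class=\"details\">"
        + "<tr><th>Tests</th><th>Failures</th><th>Time</th></tr>"
        + "<tr><td>12</td><td>1</td><td>3.5s</td></tr>"
        + "</table>";

    // Pairs each <th> text with the <td> text at the same position
    static Map<String, String> readFirstRow(String html) {
        Document doc = Jsoup.parse(html);
        Element table = doc.select("table[class=details]").first();
        Map<String, String> result = new LinkedHashMap<>();
        if (table == null) {
            return result; // no matching table: return an empty map
        }
        Elements headers = table.select("th");
        Elements cells = table.select("td");
        // Stop at the shorter list so a missing cell cannot cause an exception
        int count = Math.min(headers.size(), cells.size());
        for (int i = 0; i < count; i++) {
            result.put(headers.get(i).text(), cells.get(i).text());
        }
        return result;
    }

    public static void main(String[] args) {
        readFirstRow(SAMPLE).forEach(
            (name, value) -> System.out.println("Value of: " + name + " is " + value));
    }
}
```

A LinkedHashMap keeps the columns in page order, so the printed lines follow the table from left to right.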

I have successfully printed this info. Over to you: try it and let me know what you think of JSoup.

