Introducing PDFUtil – Compare two PDF files textually or Visually

In my project, I need to compare a large number of PDF files. I could not find a good free library that works out of the box for comparing PDFs. A plain text comparison was not enough: I was looking for something that can compare PDFs pixel by pixel to find all the differences. The libraries that can do this are not free.

So I have come up with a simple Java library (built on Apache PDFBox, which is licensed under the Apache License, Version 2.0) that can compare PDF documents in text or image mode and highlight the differences, extract images from PDF documents, save PDF pages as images, and so on.

Maven Dependency:

Include the below dependency in your POM file.

Download:

PDF compare utility with all the dependencies.


taguru-pdf-utility-v1.1.zip (45068 downloads)


Github:

The source code for this project is here.

Usage:

  • To get page count
import com.testautomationguru.utility.PDFUtil;

PDFUtil pdfUtil = new PDFUtil();
pdfUtil.getPageCount("c:/sample.pdf"); //returns the page count

  • To get page content as plain text
//returns the pdf content - all pages
pdfUtil.getText("c:/sample.pdf"); 

// returns the pdf content from page number 2
pdfUtil.getText("c:/sample.pdf",2); 

// returns the pdf content from page number 5 to 8
pdfUtil.getText("c:/sample.pdf", 5, 8);

  • To extract attached images from PDF
// set the path where the extracted images should be stored
pdfUtil.setImageDestinationPath("c:/imgpath");
// extracts and saves the images from all pages
pdfUtil.extractImages("c:/sample.pdf");

// extracts and saves the images from page number 3
pdfUtil.extractImages("c:/sample.pdf", 3);

// extracts and saves the images from page 2 only (start page 2, end page 2)
pdfUtil.extractImages("c:/sample.pdf", 2, 2);

  • To store PDF pages as images
// set the path where the page images should be stored
pdfUtil.setImageDestinationPath("c:/imgpath");
pdfUtil.savePdfAsImage("c:/sample.pdf");

  • To compare PDF files in text mode (faster – But it does not compare the format, images etc in the PDF)
String file1="c:/files/doc1.pdf";
String file2="c:/files/doc2.pdf";

// compares the pdf documents and returns a boolean
// true if both files have same content. false otherwise.
pdfUtil.compare(file1, file2);

// compare the 3rd page alone
pdfUtil.compare(file1, file2, 3, 3);

// compare the pages from 1 to 5
pdfUtil.compare(file1, file2, 1, 5);

  • To exclude certain text while comparing PDF files in text mode
String file1="c:/files/doc1.pdf";
String file2="c:/files/doc2.pdf";

//pass all the possible texts to be removed before comparing
pdfUtil.excludeText("1998", "testautomation");

//pass regex patterns to be removed before comparing
// \\d+ removes all the numbers in the pdf before comparing
pdfUtil.excludeText("\\d+");

// compares the pdf documents and returns a boolean
// true if both files have same content. false otherwise.
pdfUtil.compare(file1, file2);

// compare the 3rd page alone
pdfUtil.compare(file1, file2, 3, 3);

// compare the pages from 1 to 5
pdfUtil.compare(file1, file2, 1, 5);
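
To make the exclusion step concrete, here is a minimal, self-contained sketch of the idea: strip every excluded pattern from both texts, tidy up the whitespace left behind, then compare. This is not PDFUtil's actual implementation. The class `TextExcludeSketch` and its methods `normalize` and `sameText` are hypothetical names, and the sample strings stand in for whatever `getText` would return.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

public class TextExcludeSketch {

    // Removes every match of each regex pattern from the text, then collapses
    // runs of whitespace so the gaps left behind do not cause mismatches.
    static String normalize(String text, List<String> patterns) {
        String result = text;
        for (String pattern : patterns) {
            result = Pattern.compile(pattern).matcher(result).replaceAll("");
        }
        return result.replaceAll("\\s+", " ").trim();
    }

    // Two texts are considered the same if they are equal after normalization.
    static boolean sameText(String a, String b, List<String> patterns) {
        return normalize(a, patterns).equals(normalize(b, patterns));
    }

    public static void main(String[] args) {
        // Stand-ins for the text extracted from two PDFs (assumed sample data).
        String doc1 = "Report generated on 05/03/2021 by testautomation";
        String doc2 = "Report generated on 23/03/2021 by testautomation";

        List<String> noExcludes = Collections.emptyList();
        List<String> dateExclude = Arrays.asList("\\d{2}/\\d{2}/\\d{4}");

        System.out.println(sameText(doc1, doc2, noExcludes));  // false
        System.out.println(sameText(doc1, doc2, dateExclude)); // true
    }
}
```

Because the patterns are treated as regular expressions, a literal such as "1998" works unchanged, while text containing regex metacharacters would need escaping (for example with `Pattern.quote`).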

  • To compare PDF files in Visual mode (slower – compares PDF documents pixel by pixel – highlights pdf difference & store the result as image)
String file1="c:/files/doc1.pdf";
String file2="c:/files/doc2.pdf";

// switch to visual mode (the default is CompareMode.TEXT_MODE)
pdfUtil.setCompareMode(CompareMode.VISUAL_MODE);

// compares the pdf documents and returns a boolean
// true if both files have same content. false otherwise.
pdfUtil.compare(file1, file2);

// compare the 3rd page alone
pdfUtil.compare(file1, file2, 3, 3);

// compare the pages from 1 to 5
pdfUtil.compare(file1, file2, 1, 5);

//if you need to store the result
pdfUtil.highlightPdfDifference(true);
pdfUtil.setImageDestinationPath("c:/imgpath");
pdfUtil.compare(file1, file2);

For example, I have 2 PDF documents which have exact same content except the below differences in the charts.

 

[Images pdfu001 and pdfu002: the two PDF pages being compared]

 

 

PDFUtil gives the result shown below. Differences are highlighted in magenta by default; the color can be changed.

[Image pdfu003: the comparison result with the differences highlighted]
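
For readers curious how a pixel by pixel comparison with highlighting can work in principle, here is a minimal sketch that uses only the standard `java.awt` classes. It is an illustration under stated assumptions, not the library's actual code: the class `PixelDiffSketch`, the holder `Diff`, and this `compareAndHighlight` method are hypothetical, and the tiny in-memory images stand in for rendered PDF pages.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

public class PixelDiffSketch {

    /** Holds the number of differing pixels and the image with differences highlighted. */
    static final class Diff {
        final int differingPixels;
        final BufferedImage highlighted;

        Diff(int differingPixels, BufferedImage highlighted) {
            this.differingPixels = differingPixels;
            this.highlighted = highlighted;
        }
    }

    // Walks both images pixel by pixel. Matching pixels are copied from the
    // actual image; differing pixels are painted in the highlight color.
    static Diff compareAndHighlight(BufferedImage expected, BufferedImage actual, Color highlight) {
        if (expected.getWidth() != actual.getWidth() || expected.getHeight() != actual.getHeight()) {
            throw new IllegalArgumentException("Images must have the same dimensions");
        }
        int width = expected.getWidth();
        int height = expected.getHeight();
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int differing = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int actualRgb = actual.getRGB(x, y);
                if (expected.getRGB(x, y) != actualRgb) {
                    result.setRGB(x, y, highlight.getRGB());
                    differing++;
                } else {
                    result.setRGB(x, y, actualRgb);
                }
            }
        }
        return new Diff(differing, result);
    }

    public static void main(String[] args) {
        // Two 4x4 black images that differ in exactly one pixel (assumed sample data).
        BufferedImage page1 = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage page2 = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        page2.setRGB(1, 2, Color.WHITE.getRGB());

        Diff diff = compareAndHighlight(page1, page2, Color.MAGENTA);
        System.out.println("Differing pixels: " + diff.differingPixels); // Differing pixels: 1
    }
}
```

A comparison like this flags even a one-pixel change, which is why visual mode is both strict and slower than text mode.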

 

Features to be added soon:

  • While comparing PDFs in VISUAL_MODE, ignore certain area.
  • While comparing PDFs in VISUAL_MODE, return true / false based on certain threshold / sensitivity.
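
Neither feature is available in PDFUtil yet, so the following is only a sketch of how an ignored area and a threshold could be combined in a pixel comparison. Everything in it is an assumption made for illustration: the class `TolerantCompareSketch`, the methods `mismatchRatio` and `matches`, and the choice to express the threshold as the fraction of compared pixels that are allowed to differ.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.List;

public class TolerantCompareSketch {

    // Returns the fraction of compared pixels that differ, skipping any
    // pixel that falls inside one of the ignored rectangles.
    static double mismatchRatio(BufferedImage expected, BufferedImage actual, List<Rectangle> ignored) {
        if (expected.getWidth() != actual.getWidth() || expected.getHeight() != actual.getHeight()) {
            throw new IllegalArgumentException("Images must have the same dimensions");
        }
        long compared = 0;
        long differing = 0;
        for (int y = 0; y < expected.getHeight(); y++) {
            for (int x = 0; x < expected.getWidth(); x++) {
                if (isIgnored(x, y, ignored)) {
                    continue;
                }
                compared++;
                if (expected.getRGB(x, y) != actual.getRGB(x, y)) {
                    differing++;
                }
            }
        }
        return compared == 0 ? 0.0 : (double) differing / compared;
    }

    static boolean isIgnored(int x, int y, List<Rectangle> ignored) {
        for (Rectangle area : ignored) {
            if (area.contains(x, y)) {
                return true;
            }
        }
        return false;
    }

    // The images "match" when the mismatch ratio does not exceed the threshold.
    static boolean matches(BufferedImage expected, BufferedImage actual,
                           List<Rectangle> ignored, double threshold) {
        return mismatchRatio(expected, actual, ignored) <= threshold;
    }

    public static void main(String[] args) {
        // Two 10x10 images differing in two pixels (assumed sample data).
        BufferedImage page1 = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        BufferedImage page2 = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        page2.setRGB(0, 0, 0xFFFFFF);
        page2.setRGB(9, 9, 0xFFFFFF);

        // Ignore the top-left pixel, e.g. where a generated date might be printed.
        List<Rectangle> ignored = Arrays.asList(new Rectangle(0, 0, 1, 1));

        System.out.println(matches(page1, page2, ignored, 0.02)); // true: 1 of 99 pixels differs
        System.out.println(matches(page1, page2, ignored, 0.0));  // false: zero tolerance
    }
}
```

A ratio is only one possible definition of sensitivity; a per-pixel color tolerance would be another, and the two can be combined.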

 


189 thoughts on “Introducing PDFUtil – Compare two PDF files textually or Visually”

    1. It is not on GitHub. I have no problem sharing it with others. Please give me some time; I will share it with you ASAP.

      1. Hi, there’s been many requests for the source code to be shared. This is another +1

        Hope you can get to it sometime. While on the subject, I think it would be nice if you shared/released the source code of future tools/utilities that you offer the binary for download (if/where you have no reservations or restrictions for sharing the source). It’s a lot easier to do when you make that an intent from the beginning.

        And the lamest but still good approach would be to just tar/zip up the source code (with ideally OSS license) and offer that for download in addition to the binary, if you don’t want to deal with git/source control.

  1. Thanks for such a wonderful explanation.

    Can you please share the library with me by email?

    Thanks

    Sachin A
    India

  2. Can you push it to Git? It has quite a lot of potential and I would like to contribute to your code. Thanks – Abhishek

  3. Can you mail me the documentation of the code for easy understanding?
    I also have a similar project, and I find your work brilliant. It would be very useful if you could share how the compare works and how the result is shown. Thanks in advance.

  4. I am using eclipse to run your code. The compare block throws error as

    “Nov 02, 2015 6:01:39 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
    INFO: unsupported/disabled operation: BDC
    Nov 02, 2015 6:01:41 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
    INFO: unsupported/disabled operation: EMC”

    Where and how should I run this code to get the highlighted differences?
    Also, Is it possible for me to have this pdf comparer as a web service?

  5. I am currently working on a PDF comparer project. We are working on highlighting the differences between two PDFs. The above code helped us to compare PDFs, but the output is highlighted and overlapping. Can you please send us your source code so that we can make changes to the above result?

      1. Yes, that is expected, as it compares pixel by pixel. Even a very small change gets highlighted, so the highlights may appear to overlap. If you are looking for text mismatches, please use text compare.

  6. Very nice util !

    …one question…..

    sometimes it’s nice to have a method which enables you to exclude some part of the PDF file … by making use of page area’s which one can select or deselect….

    If you let me access the source I can make some extensions for all of us……

    anyway… nice job !

    1. You can get the content of the PDF as text. Then you can apply the logic yourself to find the mismatch. That should be very easy to implement.

  7. Hi

    we are using this, first we should thank for such a great work you provided to us. Thank you very much!

    My two PDF documents have 16 differences, but comparePdfFilesBinaryMode(file1, file2); method is showing only 13 differences(screenshots). How should we overcome this problem?

    Any suggestions? I am looking for optical character recognition (OCR) jar files to overcome this.

  8. Hi, vlns, do you have source codes posted on, e.g., github? I would like to use and contribute your project too.

  9. Hi,
    I am trying to compare two files and get following error:
    Feb 05, 2016 12:59:10 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
    WARNING: getRGBImage returned NULL
    Feb 05, 2016 12:59:10 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
    SEVERE: java.lang.NegativeArraySizeException
    java.lang.NegativeArraySizeException

    Looks like it is problem with PDFBOX.jar
    What I can see you are using ver 1.8.9
    but there is version 2.0 RC
    Can you provide your tool with updated PDFBOX to check if this will fix my problem?
    Thanks in advance

    1. Yes, That is right.
      pdfbox.jar is a separate jar in the PDFUtil. You can just replace the pdfbox.jar with the latest one. Thanks for pointing it out.

  10. Hi,
    thanks for info,
    I updated pdfbox to ver 2.0.0-rc3
    I get following error:

    Exception in thread “main” java.lang.NoSuchMethodError: org.apache.pdfbox.pdmodel.PDDocument.load(Ljava/lang/String;)Lorg/apache/pdfbox/pdmodel/PDDocument;
    at com.taguru.utility.PDFUtil.getPageCount(PDFUtil.java:160)

    I am using IntelliJ idea 15 community edition.
    Do you know how to fix this?

    1. Somehow this comment went to spam. Not sure why.
      Anyway for your question – Can you please see if you can use pdfbox-app-1.8.11.jar.?

  11. Hi, great work. I am very much interested. For our project, we need to skip some of the sections in the PDF when comparing. It would be helpful if you shared the open jar file with us. Thank you

  12. Interested in your api. Looking forward for a PDF comparison requirement., Would be grateful if you can share this source/jar file to try

    SJS

  13. Hi Vls,

    Congratulations to have created such a nice tool.

    Will you share the code? And if you are not supposed to share the code, can you at least tell us your intention?

    It looks like you are not answering to every request about sharing the code so it is unclear whether you will actually do it.

    Thanks
    Simone

  14. Wow, nice work ! This really saves me a lot of work, manually comparing hundreds of pdfs
    Currently it does not seem to run under Java 8 . :-/ Is there an upgrade planned ?

  15. I have used the functions and plugin, but we are not able to save the image as said in the last section. i.e, comparing two pdf files and highlighting the differences and writing it to in an image file. Could you please help in the regard. Piece of code is something like below.

    pdfutil.highlightPdfDifference(true);
    pdfutil.setImageDestinationPath(Path+"//results//");
    //pdfutil.savePdfAsImage(Path);
    System.out.println(pdfutil.comparePdfFilesTextMode(Doc_BaseLine, Doc_Actual));
    // pdfutil.comparePdfFilesBinaryMode(Doc_BaseLine, Doc_Actual);
    System.out.println(pdfutil.comparePdfFilesBinaryMode(Doc_BaseLine, Doc_Actual));
    //pdfutil.extractImages(Doc_Actual) ;
    pdfutil.savePdfAsImage(Doc_Actual);

  16. Hi,
    First of all, Thank you. It helped me a lot. But as per my project, i need to skip some of the sections in the PDF from comparing. it would be helpful if you share open jar file with us so that i can make changes as per need.
    Thank you

  17. Hi, rather saving the compared image to specific path I want to download the compared PDF output image file , is it possible ?? if yes plz suggest me solution for it .. Thanks

  18. Hi,

    I’ve posted multiple comments here but none are actually showing up. I would really like to use this could you please help me?

    1. After downloading the ZIP file (which contains 2 JARs), what are the exact steps to compare 2 PDFs?

    2. Could you send me a link to the source code as well?

    Thanks!

    1. I was not even able to log in to my blog because of issues with a recent WordPress update, so I could not answer your question.

      Check this link to include the downloaded jar files in eclipse.

      Once added, you should be able to use below code

      import com.taguru.utility.PDFUtil;

      PDFUtil pdfUtil = new PDFUtil();
      pdfUtil.getPageCount("c:/sample.pdf"); //returns the page count

  19. Wonderful information and Amazing explanation !!! 🙂

    I have downloaded the Zip file and when I am trying to extract the Zip unfortunately it is showing an error as “Cannot Open file: it does not appear to be a valid archive”

    could you please resend that valid Zip file to my email id

    Thanks in Advance !!!

  20. Thank you so much for sharing this tool! Would you be able to please share the source code? I need to modify it to ignore certain parts of the file and remove certain special Unicode characters from the PDF file before comparing, as they are throwing off the comparison.

    Please let me know when you share it. Thanks.

  21. Hi

    The library is promising. Can you share the source code with us? And can you point us to some documentation? Say I want to change the colour of comparison from Magenta to Green, how do I do that?

  22. Hi VLNS,

    I’ve messaged multiple times asking if the source code is available for this. Kindly let me know if it’s not so I can start my own implementation 🙂 Just don’t want to waste time implementing something from scratch if I can just build on yours so please let me know soon.

    Thanks.

  23. Nice Tool!
    Is there a way to declare wildcards in binaryMode? I generate daily pdf reports with the current date on it. The textMode is not accurate enough for my pdf files.

    It would be nice if you can declare wildcards (a region on the file may).

  24. Neat utility which great potential.

    I think it would be really good if ignore rules could be added based on some RegEx

  25. Neat utility with great potential.

    I think it would be really good if there was a way to ignore certain text by adding some Regex rules

  26. It is great work indeed. If you don't mind, can you share the source code so that we can contribute and use it for all possible requirements?

  27. Hi, I tried to compare pixel by pixel for 2 PDFs using below code. But am not getting the image, which highlights the difference between the 2 PDFs

    //if you need to store the result
    pdfUtil.highlightPdfDifference(true);
    pdfUtil.setImageDestinationPath("c:/imgpath");
    pdfUtil.comparePdfFilesBinaryMode(file1, file2);

  28. I would like to call the JAR file in VB Script or from UFT 12.53.

    Can you please guide me. Sample VB script i have attached . But unable to pass the command ( getPageCount) & arguments (“c:/sample.pdf”)

    Set WshShell = CreateObject(“WScript.Shell”)
    dim a
    a = “C:\PDF Compare\taguru-pdf-utility-v1.0\taguru-pdf-utility-v1.0\pdfbox-app-1.8.9.jar”
    WshShell.Run “java -jar ” & chr(34) & a & chr(34)

  29. Hi,

    Function convertToImageAndCompare(String file1, String file2, int startPage, int endPage) having issues, not returning anything and also unable to generate Results to a folder with the following :

    String file1 = “resources/July 16th.pdf”;
    String file2 = “resources/July 17th.pdf”;
    util.highlightPdfDifference(true);
    util.setImageDestinationPath(“/Users/test/Errors”);
    util.comparePdfFilesBinaryMode(file1, file2);

    Do we need to give any file name? tried to debug the code but it only returns only true or false, the code under the convertToImageAndCompare is commented and the function comparePdfFilesBinaryMode is calling convertToImageAndCompare which is not returning anything and getting the error.

    Thanks,
    Jeevan

  30. Trying with pdfbox-app-2.0.2 and get the following error:

    Exception in thread “main” java.lang.NoSuchMethodError: org.apache.pdfbox.pdmodel.PDDocument.load(Ljava/lang/String;)Lorg/apache/pdfbox/pdmodel/PDDocument;
    at com.taguru.utility.PDFUtil.getPageCount(PDFUtil.java:160)
    at com.taguru.utility.PDFUtil.comparePdfByImage(PDFUtil.java:459)
    at com.taguru.utility.PDFUtil.comparePdfFilesBinaryMode(PDFUtil.java:402)
    at ERS.UnitTest.Reports.ComparePDFDocuments.Compare2Documents(ComparePDFDocuments.java:27)
    at ERS.UnitTest.Reports.ComparePDFDocuments.main(ComparePDFDocuments.java:19)

    Please advise.

  31. Can you please share the source code or post it in GitHub. I want to contribute in the project too. I think there have been many requests regarding this.

  32. Hello,

    I also would like to ask you about your tool.
    Could you please share source code or give me a link?

    Thanks in advance!

  33. Hi! How about the sources? I would really like to submit some enhancements and perhaps look into the regex-excludes mentioned above.

  34. Its a Nice utility !! I just tried using it. I have used the method ‘comparePdfFilesBinaryMode’. it compared the pdfs but result image is generated only for first page of pdf though there are differences in second page too.
    Please suggest if there is any other way to generate images for each page ?

    1. Yes, Please check the API – you need to set the flag to compare all the pages. otherwise, it will just return false as soon as it finds a mismatch and exit.

  35. while comparing the pdfs, I wanted to ignore few differences (like form IDs) in the PDFs and make them pass irrespective of few kinds of differences in them.
    Can you please share source code in order to make this change for my project.

    1. PDFUtil was created by me and was poorly designed 🙁 That is the reason I have been delaying posting it on GitHub. I will work on that and upload it to GitHub very soon.

  36. Hi,
    Is it possible to share a demo video on how to use this library file to compare PDF files in Visual mode. Or Steps to do this?
    Thanks.

  37. Hi,

    I am running automation to compare more than one pair of PDF. i would like to save all the compared images into a output folder. But, your code seems to clean the folder before writing to it.

    1. Yeah!! I thought I should clear the folder, but you are right: this library should not do that. It is up to the user to decide whether to clear it or not. One easy option is to create a separate folder under the output folder for each PDF. Alternatively, the source code is available on GitHub: you can comment out the code which clears the output folder and build it yourself. I have provided the build instructions.

    1. No, not for the time being! But you can do this yourself. For example, pdfUtil.getText("c:/sample.pdf").replaceAll("[0-9]{2}[/.-][0-9]{2}[/.-][0-9]{4}", "") removes dates such as 01/02/2017 (adjust the separator to match your format) and gives you the string to compare.

      1. This is working fine for text compare. But in case of image comparison it is failing, is there a way out to either remove this from PDF ?

        1. For images, it does a pixel by pixel compare. It is a very sensitive comparison, and masking certain areas is not simple or straightforward.

  38. Can you please let us know how to compare a pdf when it has a watermark or watermark layer on it. Also can you please let us know how to delete that water mark .This utility helped us greatly.

  39. Hi , I am trying to use this utility in vbscript , my requirement is compare two pdf files in commandline and generate the difference file in specific location .. using Jar i am unable to set the image destination path in commandline ..please guide
    1.set destination image path in commandline
    2.compare the images in commandline and save the difference image

  40. We had the same requirement, I modified the main class and re-built the jar using the maven build file. We found that the comparison needed to be page by page or else you don’t get a diff image per page, also we needed a non-zero System exit value to get it to be useful in the test environment.

  41. Exception in thread “main” java.lang.UnsupportedClassVersionError: com/testautomationguru/utility/PDFUtil : Unsupported major.minor version 52.0

    I am facing this issue while running a Java program. Could you please help me resolve it?

  42. Hi I am finding issue when both of the images are having difference , then the resulting image is not highlighting the difference . Also this would be nice if you can make a side by side comparison

    1. The number of changed pixels could be very small, maybe just 1, which is why you are unable to notice the difference. I will see if we can have some threshold.

  43. Hi, If possible can you please help me with Watermark removal code ,it is really important for me and that will be of great help to me if you can do that .

  44. hi, i want to compare pdf files pixel by pixel but this is not comparing can you show to me how to execute this code

  45. This is really very nice.
    And i would like to know that whether the below features which are added soon is available in github.
    While comparing PDFs – ignore certain text using Regular Expression
    For example, 2 PDFs have same text & contains date on which it was generated which needs to be omitted while comparing.
    While comparing PDFs in VISUAL_MODE, ignore certain area.
    While comparing PDFs in VISUAL_MODE, return true / false based on certain threshold / sensitivity.

  46. Thank you so much vlns… I am new to Selenium and do not understand Git, anyway I was able to download the jar file. Just wanted to know what import command do I need to write in my eclipse after adding this jar to my Reference Library

      1. Thank You vlns. This helped me. But my PDF has some 4-5 lines in the bottom of each page that contain image having dynamic content. Is there a way I can crop them out (remove/Ignore them) before comparison in Visual Mode. Please help.

  47. Hi Vlns
    your work is marvelous, But Pixel by pixel comparison is much slower when compared to a Licensed tool(StreamDiff). Do you have any idea to increase the speed of comparison?

    Will Wait for your response!

  48. Hi Vlns,

    This works wonders…Thank you so much.

    But for pixel by pixel comparison, my PDF have 3 pages, and there were some differences on all 3 pages but the Result Image that captures the Difference only shows the same for 1st page only. Can you please help on this – as to how to showcase the differences on all the pages of PDF and not just the 1st Page.

    1. That is the default behavior to exit as soon as a mismatch is found. if you want all pages to be compared, you could set – pdfUtil.compareAllPages(true) – before comparing.

      1. Thank You Vlns. This worked. But in case the no. of pages in the PDF are not same, then this does not spot the difference in the image format. Is there any way I can capture the Pixel difference in all the pages even if page count does not match?

  49. Hi Team,

    I admire your work very much. I want to bring to your notice that image generated by below statement
    pdfRenderer1.renderImageWithDPI(iPage, 72, ImageType.RGB
    is https://drive.google.com/file/d/0B18WGCjoaDzJQXVnYVdDand3cWs/view
    which is odd, and the time it takes to compare a single page of two PDFs is 5 seconds on average.
    Could you please suggest any resolution to correct image generation and increase speed of comparison?
    Waiting for your response( positive or diplomatic, ready to receive)

  50. HI

    I am comparing two PDFs and i have enabled the logs too.
    ArrayIndexOutOfBound is coming:
    WARNING: The end of the stream doesn’t point to the correct offset, using workaround to read the stream, stream start position: 5903, length: 0, expected end position: 5903
    Apr 02, 2017 9:20:44 AM com.testautomationguru.utility.PDFUtil convertToImageAndCompare
    INFO: Comparing Page No : 1
    Apr 02, 2017 9:20:44 AM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
    WARNING: Image stream is empty
    java.lang.ArrayIndexOutOfBoundsException: Coordinate out of bounds!
    at sun.awt.image.IntegerInterleavedRaster.getDataElements(IntegerInterleavedRaster.java:219)
    at java.awt.image.BufferedImage.getRGB(BufferedImage.java:986)
    at com.testautomationguru.utility.ImageUtil.compareAndHighlight(ImageUtil.java:19)
    at com.testautomationguru.utility.PDFUtil.convertToImageAndCompare(PDFUtil.java:458)

    Need your email id so that can share the PDFs

  51. This is an awesome utility. I have one doubt, we want to change the color of the difference and I don’t want to overlap the difference, instead want it to shift towards left side. Will that be possible?

  52. And what do you think on shifting the differences to the left of the source pdf file. Actually we have two pdf files printing prices of the resources, And we want to compare the differences. But due to overlapping we can’t read baseline and actual file.

  53. Hi Vlns,
    This utility is really helpful, but i am facing one issue actually i used this utility as a jar and used in my class and passing “pdfUtil.setImageDestinationPath(DestPath);” DestPath – i am passing as string with two PDFs, — “public static String pdfMatchMethod(String pdf1Path, String pdf2Path, String DestPath ) “and converted my class as webservice, but while calling my class in Client proxy class and passing these parameters, image is not getting saved in the desired given path.

    Can you please help me in solving this issue, why it is not downloading from webservice to local.

  54. can you please tell me how do i compare the font and alignment of pdf’s with this library and with this library its not storing the differed image

    pdfUtil.setImageDestinationPath("c:/imgpath");
    pdfUtil.compare(file1, file2);

    no exception but no image in the folder as well

  55. Hi Vlns,
    This utility is really helpful. We tried to use the jar and run it, but no output was printed. Could you please help?

  56. Hi vlns,

    If the two pdf been compared are completely different then no image is generated, can you please share some info for this behavior.

    1. Yes, the very first check to do pdf compare is to match the number of pages in both pdfs. if they do not match, then it immediately fails.

  57. Hi, I want to change the color of the highlighted difference in the result image, Where should I change also the comparison seems to be overlapping, So where to change to give gap

  58. Hi,
    I have 2 files ,File A – 5 pages and File B- 10 pages . Is it possible to compare first 5 pages in FIle A and File B ? because i’m getting error as “files page counts do not match – returning false” when i try to compare these files.

  59. Hi Team,

    // compare the pages from 1 to 5
    pdfUtil.compare(file1, file2, 1, 5);
    The above comparison doesn’t work for me .It always compares the first page in file1 and file2 and skips the rest of the pages .Please help on this

  60. it’s a very useful tool, thanks for sharing.

    may i know when this tool could support ignore some certain area/content that not compare with visual comparison mode? thanks a lot in advance.

  61. Hi ,

    Could you please provide me the code to compare two pdf files and print the mismatches please in a seperate image file,

  62. Hi Vins,
    This utility is really helpful, thanks for sharing it 🙂

    Quick clarification
    Below one is not working and It always compares the first page in both the files and skips the rest of the pages .Please help me on this
    pdfUtil.compare(file1, file2, 1, 3);

    1. That is because it has already found a mismatch, so there is no point in proceeding further. PDF comparison is somewhat time consuming, so exiting early saves some time. To compare all pages, call pdfUtil.compareAllPages(true); before comparing.

      1. Thanks vIns, it’s comparing all the pages now..
        No image saved if there is no change in that particular page..
        Say for example I have PDF with 2 pages.. there is some difference in 1st page and no change in 2nd page.. in this scenario it saves image only for 1st page since no change in the 2nd page..but I want image to be saved for 2nd page as well.. could you please help me on this

        1. Hi,

          Can anyone help me why I am getting the below error.

          Exception in thread "main" java.lang.UnsupportedClassVersionError: sikuli/nm/gh : Unsupported major.minor version 52.0
          at java.lang.ClassLoader.defineClass1(Native Method)
          at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
          at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
          at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
          at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
          at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
          at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
          at java.security.AccessController.doPrivileged(Native Method)
          at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
          at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
          at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:472)
          Picked up JAVA_TOOL_OPTIONS: -agentlib:jvmhook
          Picked up _JAVA_OPTIONS: -Xrunjvmhook -Xbootclasspath/a:”C:\Program Files (x86)\HP\Unified Functional Testing\bin\java_shared\classes”;”C:\Program Files (x86)\HP\Unified Functional Testing\bin\java_shared\classes\jasmine.jar”

          code used is PDFUtil pdfUtil = new PDFUtil();
          //pdfUtil.setCompareMode(CompareMode.VISUAL_MODE);

          String file1="C://Program Files//Java//eclipse//capture.pdf";
          String file2="C://Program Files//Java//eclipse//capture1.pdf";

          // compares the pdf documents & returns a boolean
          // true if both files have same content. false otherwise.

          //pdfUtil.compare(file1, file2);

          // compare the 3rd page alone
          ((PDFUtil) pdfUtil).compare(file1, file2,1,1);

  63. vIns, Its very useful.. thanks..
    But no exception generated and no image stored in the folder for few pdfs
    I didnt get any logs in console.
    could you please help me in enabling the logs in pdfutil

  64. Hi,
    I tried to compare two images using below code
    //if you need to store the result
    pdfUtil.highlightPdfDifference(true);
    pdfUtil.setImageDestinationPath("c:/imgpath");
    boolean results= pdfUtil.compare(file1, file2);
    System.out.print(results);
    pdfUtil.enableLog();

    running the code in intellj.. am getting below message
    Jul 18, 2018 11:59:37 PM org.apache.pdfbox.pdmodel.font.PDCIDFontType2
    INFO: OpenType Layout tables used in font CIDFont+F1 are not implemented in PDFBox and will be ignored
    false
    Process finished with exit code 0

    but I cant see the output in my c drive. please help and advise

  65. Very helpful and easy to use library.
    I want to vote for the feature “ignore certain areas in visual mode”.
    Then it would be perfect for us.

  66. Hi, this is a very good library, I tried to run it from mac and set imagedestination path as : /Users/mymac/Documents/imagepath, but image result is not generated and also no error in console, am i missing anything here ?
    My code:
    PDFUtil pdfUtil = new PDFUtil();
    pdfUtil.setCompareMode(CompareMode.VISUAL_MODE);
    pdfUtil.highlightPdfDifference(true);
    pdfUtil.setImageDestinationPath("/Users/mymac/Documents/imgpath");
    pdfUtil.compare(file1, file2);

    1. Do you have permissions to write the result in the directory? Also you can enable log to see whats going on! pdfUtil.enableLog(level)

      1. Not OP. Enabling the logging [pdfUtil.enableLog();] helped me with this issue.
        Showed that the 2 PDFs were not the same page length which totally prevent any image being produced.

        Is there a way to get an image of every page difference or does it stop at the first difference?

    1. Not as of now. But it can provide the content for the given pages. So you can do the comparison yourself by getting the text.

  67. Thank you so much for this utility. It helped me a lot. However my requirement wants the functionality where I need to ignore few parts of PDFs while making the comparison. I have used your projects and made further changes to accommodate my requirement. Appreciate your efforts. Thanks again.

  68. Really this tool is good. I expect one suggestion.
    How to print result pdf’s without actual contents. I expect tool should highlight both actual and expected file changes individually and should not write any contents from actual to expected or expected to actual.

  69. is this tool still working an can be used because I saw on Maven site that last time changes had been made about 2 years ago? Please, let me know

    Thanks

    Jeff

      1. Hi Vins,

        Good to know. Now, how can I know if possible what are the differences in terms of text sections? This is when I choose just text comparison? Is it a way to define a threshold in terms of percentage of differences for test passes? mean like it’s only 10% , it’ll be OK to pass. Also, do you know by any chance if Adobe provides some sort of API to compare several PDF files? Please, let me know

        Thanks

        Jeff

      2. Oh, can you please confirm that I will be able to, for example, exclude the date from the files and then compare them visually? The example only shows that approach for text comparison. Please let me know.

        Thanks

        Jeff

        1. Text compare is relatively easy, and you can exclude text there. Visual compare is difficult, and currently this utility does not support exclusions in visual mode. I have already described all the available methods here.

  70. Hello Vinoth Selvaraj,

    First of all, thanks for sharing this with people. I would like to clarify the license. Is it GNU? May I use it in non-profit or commercial applications?

    Dmitrii

  71. Hi,
    at the end of the article stays:
    Features to be added soon:
    While comparing PDFs in VISUAL_MODE, ignore certain area.

    are you so far? Can be configured which area to be ignored?

    Thank you very much,
    Robert

      1. Thanks for the response. The problem for example: Two PDFs document have same Date format but the dates are different: i-e: One document has date: 05/03/21 the other has: 23/03/21. Since this method using: equalsIgnoreCase(), which does not see the date format but it sees if they are ==. Which implies test to get failed cause dates are different, I want to ignore these dates simply before comparison.
