  Generating Thumbnails for PDF Pages 


I was making some changes to a website that hosted a number of PDF files, and I wanted to post a thumbnail image of each one. There were enough files that I took the lazy route and wrote some code to do it for me. I didn't want to go out and get a library for the job, so I started poking around to see what I already had that could get it done quickly.

It turns out that Adobe Acrobat Professional exposes quite a bit via COM, so I decided to see how far I could get with it. Well, guess what? It worked. Sort of. I could open the documents and get a reference to a page in the PDF; the problem was getting an image of that page for the thumbnail. The page class does have a CopyToClipboard method that copies the currently referenced page to the clipboard (and even lets you specify the rect coordinates you want to copy). While I am not thrilled about using the clipboard, I couldn't find any other route, so I went with that. Once the page is on the clipboard, it is easy enough to get the clipboard data as an image and use it however you need.

So I created an app that traversed the PDF files in a directory, creating a thumbnail for each one. Pretty easy. It's not the fastest thing ever, and because it uses the clipboard you can't run it from a service or other non-interactive context, but all in all it got the job done with flying colors. I put together a scaled-down version of the app as a demo so I could post about it here. Here's the code in a simple form to generate a thumbnail image for the first page in the PDF file:

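// add reference to "Acrobat" COM server defined in "acrobat.tlb" 
// add reference to System.Windows.Forms (for Clipboard, IDataObject and MessageBox)
// add using directives 
using System;
using System.Runtime.InteropServices;
using System.Drawing;
using System.Windows.Forms;
//...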


Acrobat.CAcroPDDoc doc = null;
Acrobat.CAcroPDPage page = null;

try
{
    // instantiate adobe acrobat
    doc = (Acrobat.CAcroPDDoc)new Acrobat.AcroPDDocClass();

    if (doc.Open(@"C:\MyFile.pdf"))
    {
        if (doc.GetNumPages() > 0)
        {
            // get reference to page
            // pages use a zero based index so 0 = page1
            page = (Acrobat.CAcroPDPage)doc.AcquirePage(0); 

            // get dimensions of page and create rect to indicate full size
            Acrobat.AcroPoint pt = (Acrobat.AcroPoint)page.GetSize();
            Acrobat.CAcroRect rect = new Acrobat.AcroRectClass();
            rect.Top = 0;
            rect.Left = 0;
            rect.right = pt.x;
            rect.bottom = pt.y;

            // copy current page to clipboard as image 
            page.CopyToClipboard(rect, 0, 0, 100);

            // get image from clipboard as bitmap (null if nothing usable was copied)
            IDataObject data = Clipboard.GetDataObject();
            Bitmap bmp = data == null ? null : data.GetData(DataFormats.Bitmap) as Bitmap;

            if (bmp != null)
            {
                // calculate new height and width for thumbnail and maintain aspect ratio
                int w = 100;
                int h = (int)((double)pt.y * ((double)w / (double)pt.x));

                // create thumbnail and release the GDI+ images when done
                using (bmp)
                using (Image img = bmp.GetThumbnailImage(w, h, null, IntPtr.Zero))
                {
                    img.Save(@"C:\MyThumbnail.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);
                }
            }
        }
    }
}
catch
{
    // if we get here and doc is null then we were unable to instantiate Acrobat
    if (doc == null) MessageBox.Show("Acrobat is not installed. Adobe Acrobat is required.");
    else throw; // any other failure should not be swallowed silently
}
finally
{
    if (page != null) Marshal.ReleaseComObject(page);
    if (doc != null) Marshal.ReleaseComObject(doc);
}

One thing to mention: the CopyToClipboard method does allow you to specify a "zoom", so why not just use that to size the thumbnail? When you specify a smaller zoom ratio, the image copied to the clipboard is still the size of the entire original document, with the image of the page drawn smaller in the corner. Not exactly what I wanted. So I get the full image and then size it myself. Of course, you can skip the resize step if you also want a full-size image of the page.
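The "size it myself" step boils down to one aspect-ratio calculation, which can be pulled into a small helper. This is only a sketch: the class and method names (ThumbnailSize, HeightForWidth) are made up for illustration and are not part of the sample app, and it rounds to the nearest pixel where the listing above truncates.

```csharp
using System;

static class ThumbnailSize
{
    // Hypothetical helper: returns the thumbnail height that keeps the
    // page's aspect ratio when the thumbnail is targetWidth pixels wide.
    public static int HeightForWidth(int pageWidth, int pageHeight, int targetWidth)
    {
        if (pageWidth <= 0 || pageHeight <= 0 || targetWidth <= 0)
            throw new ArgumentOutOfRangeException("pageWidth", "All dimensions must be positive.");

        // Scale the height by the same factor applied to the width,
        // rounding to the nearest pixel and never returning less than 1.
        int height = (int)Math.Round(pageHeight * ((double)targetWidth / pageWidth));
        return Math.Max(1, height);
    }
}
```

For a 612 x 792 page (US Letter in points) and a 100 pixel wide thumbnail, ThumbnailSize.HeightForWidth(612, 792, 100) gives 129, which could then be passed as the height to GetThumbnailImage.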

I threw together a small sample app using the code above to display and save any page from a PDF file.

Feel free to download the code for your own use. It requires .NET 2.0 and Acrobat 7.0 Professional. Version 7.0 is required only because that is the version my interop DLL was generated from; the approach itself works with earlier versions as well.
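For the directory-wide version mentioned earlier, the loop around the single-file code might look roughly like the sketch below. Everything named here is an assumption for illustration: ThumbnailBatch, the ThumbnailWriter delegate and the folder parameters are not from the downloadable sample, and the delegate simply stands in for the Acrobat and clipboard code shown above.

```csharp
using System;
using System.IO;

static class ThumbnailBatch
{
    // Hypothetical callback: stands in for the single-file thumbnail code,
    // taking the source PDF path and the JPEG path to write.
    public delegate void ThumbnailWriter(string pdfPath, string jpgPath);

    // Walks every *.pdf in pdfFolder and asks the callback to write a
    // matching .jpg into thumbFolder. Returns the number of files processed.
    public static int Run(string pdfFolder, string thumbFolder, ThumbnailWriter write)
    {
        if (write == null) throw new ArgumentNullException("write");

        // CreateDirectory is a no-op when the folder already exists.
        Directory.CreateDirectory(thumbFolder);

        int count = 0;
        foreach (string pdfPath in Directory.GetFiles(pdfFolder, "*.pdf"))
        {
            string jpgName = Path.GetFileNameWithoutExtension(pdfPath) + ".jpg";
            write(pdfPath, Path.Combine(thumbFolder, jpgName));
            count++;
        }
        return count;
    }
}
```

Because each call goes through the clipboard, the files are processed one at a time, and the loop should run on the same interactive thread the single-file code would use.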




                   




Comments

  1. Eber Irigoyen 1/31/2006 2:07 PM
    I like the end result, but there are two things that I don't like about the implementation

    - first, using bitmaps would consume tons of memory and is very slow as you already mention
    - using the clipboard to transfer the image, is insecure and you have to ask your self "what if two programs did this?"
    http://blogs.msdn.com/oldnewthing/archive/2005/06/07/426294.aspx

    so this might be good for quick and dirty though, but for a real production application I wouldn't risk it
  2. Ryan Farley 1/31/2006 2:10 PM
    Eber,

    Absolutely. I agree. I would never use this for a production application. The memory consumption by passing images around on the clipboard like that (and these images would be significant in size) also makes this something not to be used in production.

    However, if all you're looking for is a quick way to get a few thumbnails, this worked great.

    Thanks!
    -Ryan
  3. Dave 2/1/2006 7:50 AM
    There's a Code Project page also for this here:
    http://www.codeproject.com/dotnet/pdfthumbnail.asp

    And for an ASP.Net component that's free:
    http://www.tallcomponents.com/Default.aspx?id=pdfthumbnail

    The CodeProject source might give you more ideas on other ways to do the job.

    Dave ........
  4. Ryan Farley 2/1/2006 10:53 PM
    Thanks Dave!

    That codeproject article adds a very nice touch of the little bent corner of the page on the top left. That's nice. But it's also good to see that I wasn't the only one with this approach. To be honest I spent a lot of time trying to find some way to avoid using the clipboard (the idea was making me cringe). I kept thinking that I must be missing something else to use to do that, but I guess not since that's the same approach the author of the CP article used.

    Also, thanks a bunch for the tip on the free tall component for pdf thumbnails. That one's a keeper. Good one to use if I ever need to do this again for a production app.

    Thanks.

    -Ryan
  5. Subbaram 6/21/2006 1:46 PM
    Any idea how to access the form fields in the pdf. I need to populate the fileds from the database. I used TallComponents and was able to do it. I want to do from the Acrobat COM. Thanks in advance
  6. Rahul 12/10/2006 11:12 PM
    I get an error when i run the application the error says
    Retrieving the COM class factory for component with CLSID {FF76CB60-2E68-101B-B02E-04021C009402} failed due to the following error: 80040154.

    Please help
  7. Ryan Farley 12/11/2006 6:22 AM
    Rahul,

    Sounds like you don't have Adobe Professional installed (as mentioned in the article it is a requirement to use the code)

    -Ryan
  8. Babin 6/6/2007 3:38 AM
    I have installed Adobe Professional.
    But still giving error "Retrieving the COM class factory for component with CLSID {FF76CB60-2E68-101B-B02E-04021C009402} failed due to the following error: 80040154"

    Some Where I read that I have to give the permission to ASP.NET user.
    Can anybody help on this regard
  9. Michael 7/19/2007 5:50 AM
    Ryan,

    Thank you so much. The app works great. Thanks for the code too.
  10. Jose Kurian 7/24/2007 10:33 PM
    hi I am also getting the error Retrieving the COM class factory for component with CLSID {FF76CB60-2E68-101B-B02E-04021C009402} failed due to the following error: 80040154.

    Can any one help me out telling it is solved?
  11. Md. Golam Rabbani 7/3/2008 10:17 PM
    Hello

    this is working fine on windows application.But it is throwing error on Web based Application.

    Clipboard.GetDataObject(); is returning null


    can u help me please

    Thanks in Advanced

    Md. Golam Rabbani

  12. Ankita 7/31/2008 3:48 AM
    I have error for like
    Retrieving the COM class factory for component with CLSID {FF76CB60-2E68-101B-B02E-04021C009402} failed due to the following error: 80040154
    reply me fast
  13. Ryan Farley 7/31/2008 8:32 AM
    Ankita,

    That error indicates that the COM library with that CLSID is not registered on the machine. Do you have Acrobat PROFESSIONAL installed (note, normal free acrobat will not work for this).

    -Ryan
  14. Md. Golam Rabbani 8/11/2008 1:55 AM
    doc = (Acrobat.CAcroPDDoc)new Acrobat.AcroPDDocClass();
    i am getting error in the above line...i have installed Acrobat PROFESSIONAL. this is working fine in local machine..but in the server it is throwing error.i am useing ASP.Net 3.5 and Acrobat 8
  15. Swati 8/29/2008 5:00 AM
    Hi ryan,

    Thank you for explaining this thumbnail generation procedure in details.

    Here is my problem.
    I am having an PDF where the rect.right = 2835;
    rect.bottom = 3969;

    Thumbnail generation fails as it fails to return bitmap object.

    ************
    Bitmap bmp = (System.Drawing.Bitmap)data.GetData(DataFormats.Bitmap);
    **********

    data.GetData doesnt return anything. Do you have an idea if there is any limitation for rect.right and rect.bottom values.

    Swati

    Thanks in advance
  16. Brad 9/11/2008 9:35 PM
    I have found that the clipboard will not return any thing for two reasons.
    1. If you are running this as a web application. Found many reason on the web due to the fact that code behind can not access clip board content. Also a web app does not use the System.Windows.Forms where the clip board is located. I have read the .NET 3.5 does have a System.Windows.Clipboard.

    2. If you have a worker thread doing the work, it can not access the Main thread where the Clipboard data is located. You much create a delegate that returns and IDataObject and use BeginInvoke and EndInvoke to get the returning clipboard data back from the Main Thread.

    I did find one article explaining how this can be done over the web with web services. Very long, use 2 if not 3 web services to get the job done. But in a nut shell, that example Serialized the clipboard data and pass it back and forth between the web server and client.

    -- Hope this helps --

    Brad
  17. Gabriel 10/8/2008 10:41 AM
    Thanks Ryan for the code, was very helpfull in a proyect that was doing.
    I made a mixture with a similar proyect in http://www.codeproject.com/KB/GDI-plus/pdfthumbnail.aspx
    ,but the .dll reference of that is too old (in the download from this page, no problem), and the code have some bugs like "stop to render at the amount of 250 thumbs....with no reason" I solved with your code, and for make a entire folder of pdf, just include:

    string[] files = Directory.GetFiles(pdfInputPath, "*.pdf");

    and after read the array whit a for loop :

    for (int n=0; n < files.Length; n++)
    {
    string inputFile = files[n].ToString();

    i make a tons of preview ;-)
    thanks again!!!
  18. Ajith 5/5/2009 12:22 AM
    Imagemagik has this covered. Works on windows. No COM API, no Acrobat.

    http://blog.prashanthellina.com/2008/02/03/create-pdf-thumbnails-using-imagemagick-on-linux/
  19. JomyK 6/18/2009 12:57 PM
    Acrobat.CAcroPDDoc pdfDoc = (Acrobat.CAcroPDDoc)new Acrobat.AcroPDDocClass();
    bool ret = pdfDoc.Open("c:\\1_1277.pdf");
    return pdfDoc.GetNumPages();

    I am trying to get the page numbers. It works in local, but in IIS, not workin? any Idea
  20. vijai 10/19/2009 3:45 AM
    hi,

    I installed the Acrobat. But AcroExch.PDDoc is missing. where can i get this full Acrobat.
  21. Ryan Farley 10/19/2009 7:22 AM
    @vijai, the AcroExch.PDDoc object will only exist if you install Acrobat Pro. You won't see it if you just install the free version.
  22. anirudha 1/14/2010 3:30 AM
    I like it but when i run it he says that Adobe acrobat is required
  23. Avatar 1/15/2010 7:40 AM
    I like it, but I searching for something that generate a pdf's preview by a folder and get the result with json on my website page.
  24. hahaha 2/2/2010 4:37 PM
    > Retrieving the COM class factory for component with CLSID {FF76CB60-2E68-101B-B02E-04021C009402} failed due to the following error: 80040154

    add web.config
    OK
    <identity impersonate="true" userName="+++" password="+++" />