Hello,

I've updated my PDFReader project to use my own libPDFium library, built from scratch from the latest PDFium source code!

https://github.com/tothpaul/PDFiumReader



Comments

  1. Paul, is the compiled library available for download somewhere?

  2. Paul TOTH Thanks - obviously it was too 'simple' for me :-)

  3. Paul, can we read the content of the PDF? I mean, extract all the text or even the images from a PDF?

  4. Stéphane Wierzbicki you can read the text, just use IPDFText; for images I'm not sure... PDFium is not well documented and the C++ code is horrible :)

  5. +Paul TOTH I am happy to hear you are no longer distributing those unvetted PDFium DLLs found in the wild (but now, perhaps, you are making new ones).

    The latest? What is that? It changes by the hour, often with some pretty horrible mistakes that get added in. I go through that code every day trying to decide what is good and what is bad.

    I mean, getting an image placement is marked as an "experimental API" (with the matrix getter just now appearing, and they can't even decide if you are supposed to send a page, a path, or an image as the first parameter).

    "The latest" probably is no longer.

    To be useful, you might consider publishing the commit number used, along with some of the voodoo used to make the DLL, like the calling conventions expected for both the exported functions and the callback functions (those are not set in the PDFium code).

    I would mark the DLLs to make them easy to identify, since some folks actually keep track of the different PDFium DLLs found in the wild.

    Just my 2 cents.

  6. Joe C. Hecht I'd like to add a version.rc file to this project, but I didn't find how to do that.

    I didn't see any notion of a "commit number" in the BUILD process... For now, I'm just very happy to have figured out how to build my own DLL in very few steps.

  7. Paul TOTH I salute you for taking it on. It took me a good solid month to get very good, controllable Windows builds.

    Use a resource compiler to go from .rc to .res and build it in. For marking, you may be better off hard embedding, since the .res can easily be changed out after the fact. I looked at the DLL; it will be easy enough to identify (at least it hints that it came from your project via what you export).

    I get a lot of customers using PDFium DLLs who have no clue where the DLL came from (and we try to identify them, usually with pretty good results).

    People just use these DLLs without thought or consideration of what they are built on or where they came from. Your DLLs will get swiped up and used elsewhere in the same way, and end up in production software.

    It's a big problem (and a huge security risk).

    PDFium is a project that often changes by the hour, and often uses beta versions of third-party libs. Getting a build suitable for production is an art (and a bit of a science), taking a full-time effort.

    FWIW, I only know of perhaps three build sources that I consider production suitable, and they have large teams working on it. Past that, it's all willy-nilly, and you probably get what you pay for.

    PDFium is a huge project, and there are few people on the planet that really understand the workings of a PDF engine.

    Joe

  8. Joe C. Hecht I know how complex it is to deal with product versions.
    In the provided binary DLL I've added the version info myself. The IPDFium interface requests a RequiredVersion and provides a GetVersion method.

    PDFium is huge, but it doesn't look like clean code; for instance, in the same source code you can find two different coding styles:


    https://pdfium.googlesource.com/pdfium/+/master/fpdfsdk/fpdf_view.cpp#620

    It's not a big deal, but it makes things harder to understand, especially for me; I'm not a C developer :)

  9. Paul TOTH You can certainly tell what was originally there, what is core, and what gets added in half-baked. It is a good lib for some things (viewing and JS are pretty good), but I caution against its use for a lot of things.

  10. Joe C. Hecht do you know of any other open source library that lets us do what PDFium does?

  11. Stéphane Wierzbicki that's an open-ended question. You mean free (as in cost) or source available? PDFium can view, but what else do you want? It's not great for everything. What platform? What compiler? Need a complete PDFium Pascal interface? I'm getting ready to put one out, but it is not going to be free (but you will get a certified, maintained, and tested build).

  12. Joe C. Hecht I was thinking of both of them: free, with source code. I only need a library that can extract text or images from a PDF file.

  13. Stéphane Wierzbicki Yes, there are a number of free open source packages to do that. Check your Linux packages.

  14. Paul TOTH I'm not trying to rain on your parade (in fact I salute your effort), but if you are going to be a good open sourcer, you should play by the rules.

    You have a legal and moral responsibility to ship a copyright notice and/or license file for every third-party source file that requires it for your PDFium binary distribution.

    It's going to take you quite a bit of time to hunt them all down. It is not real obvious, it's a lot more than what is in the download, more than listed in the third party directory, and IMHO those notices are missing quite a bit (some is stripped out).

    I doubt anyone is going to take you to court, but the personal liability of your download is huge, and that liability gets spread to everyone distributing your DLL.

    You did a great job of protecting your source and taking credit, but what about the work of others that you are building on?

    What about the people that use your project?

    Does your licensing work with the licensing of the other dependencies?

    These questions are very expensive to take on. I had to get a lawyer to advise us.

    The return on investment is why there are (were?) commercial developers willing to support Delphi.

    Many are closing their doors, often enough due to half-baked open source projects like this one.

    I mean no offense by that. Seriously though, you don't know what you're doing.

    I know I am going to lose sales to it, and other commercial developers will too.

    How much does that help the Delphi community? How much does it hurt the Delphi community when we close our doors?

    The first rule of "Open Source" is to ask "what must I do to use and distribute it?".

    Good luck with that. There is not an easy, fast, or cheap answer. But you need one if you are going to legitimately use open source.

    Joe

