Relatively sane conversion of PDFs to web-ready JPGs using ImageMagick.

September 15th, 2011

Some people, when confronted with a problem, think “I know,
I’ll use ImageMagick.” Now they have two problems. *

For one of the sites I’m maintaining, a lot of content is generated directly from (more or less print-ready) PDFs. The only free tool I’ve been able to find that can convert PDFs to decent quality JPGs or PNGs is ImageMagick.

But even when you’ve got ImageMagick’s convert and mogrify commands installed, conversion of PDFs still requires some careful tuning, that is: careful selection of arguments to convert. Also: a sacrificial chicken and lots of patience. Anyway, here’s what I ended up with. Most of this is also available in my clj-imajine clojure library.

Color space.

Many web browsers do not support any color space other than RGB/sRGB. If your PDFs are in the CMYK color space (usual for print) or any other color space, the resulting JPGs will look “weird” in many applications and web browsers; some viewers just show a blank image and others completely mess up the colors. To make sure the end result is in sRGB, use the option “-colorspace sRGB”.

Color depth.

For much the same reasons, you want to enforce that the output color depth is 8 bits for JPGs. To do that, use the option “-depth 8”.

Crop boxes.

PDFs are pretty complex documents and one potential pitfall is that there are at least 3 different indicators of the “boundaries” of the PDF. I’ve run into a few where the “right” boundaries were provided by the “cropbox” instead of the “media box”. This post by Joseph Scott provided the solution: use “-define pdf:use-cropbox=true”.

The final line becomes:

convert -define pdf:use-cropbox=true -colorspace sRGB -depth 8 pages.pdf pages.jpg

Note that if your PDF contains more than one page, this will generate a JPG for each one, named pages-0.jpg, pages-1.jpg etc… To select a single page you can use

convert -define pdf:use-cropbox=true -colorspace sRGB -depth 8 pages.pdf[X] pages.jpg

where X is the page number minus 1. You can list the page indexes in a PDF using ImageMagick’s identify command like this:

identify -density 2 -format "%p," pages.pdf
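The off-by-one between page numbers and ImageMagick's zero-based page indexes is easy to get wrong, so here is a minimal shell sketch wrapping the two commands above. The function names pdf_page_count and pdf_page_to_jpg are hypothetical, and the CONVERT and IDENTIFY variables are assumed overrides that exist only so the helpers can be dry-run without ImageMagick; none of these are part of ImageMagick itself.

```shell
#!/bin/sh
# Hypothetical helpers around the commands from this post.
# CONVERT and IDENTIFY are assumed override variables (not ImageMagick
# features); when unset, the real convert and identify commands are used.

# Print the number of pages in the PDF given as $1 by counting the
# comma-terminated page indexes that identify prints ("0,1,2," = 3 pages).
pdf_page_count() {
    count=$("${IDENTIFY:-identify}" -density 2 -format "%p," "$1" | tr -cd ',' | wc -c)
    echo "$((count))"   # arithmetic expansion drops any padding from wc
}

# Convert a single page to a JPG.
#   $1 = input PDF
#   $2 = page number as a human counts it (first page is 1)
#   $3 = output JPG
pdf_page_to_jpg() {
    index=$(($2 - 1))   # ImageMagick counts pages from 0
    "${CONVERT:-convert}" -define pdf:use-cropbox=true \
        -colorspace sRGB -depth 8 "${1}[${index}]" "$3"
}
```

With these defined, pdf_page_to_jpg pages.pdf 3 page3.jpg would run convert on pages.pdf[2]. Quoting the bracketed file name also stops the shell from treating it as a glob pattern.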

*) paraphrased from Jamie Zawinski’s remark on regular expressions.

Announcement: pretzel – clojure predicate functions

April 5th, 2011

I’m working on pretzel right now. It’s a basic library that can be used to combine predicates and also holds a bunch of tests on string content.

Code and documentation are on github.

Announcement: flutter-decline-demo – validation and form generation on compojure

April 2nd, 2011

Some people indicated they wanted some example code for my clj-decline
(validation) and flutter (form generation) libraries. So today I wrote
a simple demo application that uses both.

Get the code at github.

Announcement: flutter – clojure / hiccup form fields

March 26th, 2011

Today I’ve been working on flutter, a library for saner form generation. The first (pretty basic) release is on clojars. I’m still working on many details, including more-or-less full-coverage tests.

Get the code on github.

Also see the announcement on the clojure group.

Announcement: ring-persistent-cookies

March 12th, 2011

Just released a minimal library to generate persistent cookies for ring.middleware.cookies.

It’s at github and clojars.org.

Announcement: clj-decline – validation sucks.

March 12th, 2011

I pushed a new validation library for clojure to github yesterday. Check out clj-decline.

Why another validation library?

Well, why does validation suck so much?

Of course, dealing with user input is annoying anyway. But validation libraries always seem to want to do things in just the wrong way for the project you’re working on.

Let me count the ways:

  • They do too much. Session management? XSS detection? Javascript validation? Form definitions? Bollocks. It will never work with the code I’ve already got running in production and I’m not going to drag in all of it just to get the core validation functions, assuming I can even use them in my app.
  • Also, validation does not mean “force this input into another type”. That’s not validation. Stick it somewhere else.
  • They assume too much. If you can only validate a single value in a map, you’re useless. I need to check if a frobniz has either two wheebles or an odd number of crinks, and don’t try to stop me.
  • I might need to validate something that isn’t a map. Maybe I want to check two maps. Maybe I need to check a single string or a file upload.
  • They still assume too much. This may come as a shock, but not everyone speaks English on this planet. I have to support multiple languages in the same web app. Give me more options than pre-defined strings for errors, you lousy piece of American imperialist software! People of the earth, throw off your shackles and your “reality” TV shows!

    Ahem.

  • Macros, macros, macros all over the place. Yes, macros are cool, but no, I don’t want to stick every validation in a named var. If I wanted that, I could (def some-name (make-validation …)) so I don’t need your macro anyway. I want to use closures that validate for this specific user and now you’ve stopped me.
  • Don’t be passive-aggressive. Validation is a user-centric feature. If I wanted to tell the user only their first error, I’d use exceptions. Don’t force the user to submit their form twenty times until they’ve fixed all their mistakes. Give them as much information as possible so they know what’s going on.

So, how does clj-decline fix all that?

It doesn’t. It just stays away from most of the above. clj-decline is simple. It validates arguments and returns errors. Everything else is up to the user or some other library. It’s completely functional, has no macros, no built-in predicates, nothing binds it to a web framework or anything else, and errors / messages can be anything you like. The only decision I made is that errors are grouped by key (which can also be anything you like).

Enjoy!

Joost.

Talk @ amsterdam-clojurians, Wednesday March 9, 2011

March 7th, 2011

 ____________________ 
< Functional Clojure >
 -------------------- 
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

I’ll be doing a short presentation on the basic higher-order (sequence) functions in clojure.core at the Amsterdam Clojurians meeting next Wednesday. This talk should be understandable and useful for Clojure newbies. If you’re interested, just show up around 7.

Update: quite a few people showed up to the meeting. I’ve put the slides on github, and here is the elisp code I used to present the slides.

SLIME hints #5 – slime-apropos

February 10th, 2011

This is part of the series on SLIME functions. See the introduction for information on what SLIME is.

Another very short post.

Now what is the name of that function again? Which namespace contains that variable?

Call slime-apropos (default key binding: C-c C-d C-a), type the part of the name you remember, and SLIME will list all matching globals/symbols in the running program.

Announcement: ring-upload-progress

September 18th, 2010

I’ve forked off ring.middleware.multipart-params into a new library called ring.middleware.upload-progress.

It’s a bit rough-and-ready for now, and it uses the session to store the shared state about current uploads, which is probably not the best way to go about it. I’m thinking about introducing a new lower-level mechanism for low-memory shared state based on clojure’s STM tools, since this kind of thing is interesting for more than just uploads. Still, it does work as long as you’re not doing anything too fancy (if you’re worried, you can use a separate session store just for this info).

For the interested: the code is at github.

I’ve got some working “upload progress bar” javascript and routing code too, but it’s not in this project, so for now, you’ll have to roll your own.

I’ll get a clojars release done as soon as I’m satisfied this stuff is usable. Probably in the next few days.

Announcement: clj-sql and clj-imajine

September 15th, 2010

I just released a few bits of image code that might be useful to more people. clj-imajine can read, write and scale images. There’s also some minimal experimental pdf support. More features will be added when I need them or if someone else sends me a good pull request.

Source code is on github. Leiningen/maven jars are on clojars.

Also, Saul Hazledine got in touch with me to merge our changes to clojure.contrib.sql and make it an official project. It’s called clj-sql and differs from clojure.contrib.sql mostly in that it makes it possible to use :keywords-with-dashes as column names (for most operations, at least, so far) and that its support for auto-generated keys is much better, i.e. there is some!

Source code is on github. Leiningen/maven jars are on clojars.