Extracting and Editing PDF file content - Part 2

In the previous post we looked at extracting raster (pixel) based content from PDFs. Today we’ll cover extracting vector content and look at the tools in Acrobat for ‘round-trip’ editing of PDF content.

So, we’ve got our PDF (below) and we need to extract the logo. Since the logo is vector based we want to preserve the vector information, so Photoshop isn’t an option: it will rasterize any content we try to extract. Although it’s possible to create and edit vector content in InDesign, it can’t open vector or PDF files directly; they would need to be placed as linked files. Our weapon of choice here is Illustrator, and there are a couple of routes to gaining access to the content in a PDF.

 

[Image: Extracting vector content from a PDF]

Firstly, let’s try the direct approach and open the PDF in Illustrator. Open up Illustrator, go to File>Open... and select your PDF in the file browser.

There’s a good chance you’ll be presented with a dialogue box warning you that certain PDF content has been converted/reinterpreted – most often this will involve fonts being outlined to preserve the original appearance of text that uses fonts not available on your system.
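If you’d like to know in advance which fonts a PDF refers to (and so which ones Illustrator might need to outline or substitute), a rough check is possible outside of Adobe’s tools. The Python sketch below is only an illustration and isn’t part of the workflow in this post. It scans the raw bytes of a PDF for /BaseFont entries, which is where PDF font dictionaries record the font name. The function name referenced_fonts and the sample data are made up for the example. It will miss fonts stored inside compressed object streams, and it doesn’t tell you whether a font is embedded in the PDF or installed on your system.

```python
import re

# /BaseFont is the key PDF font dictionaries use for the font's name.
BASEFONT = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")

# Subset fonts carry a six-uppercase-letter tag and a plus sign, e.g. "ABCDEF+".
SUBSET_PREFIX = re.compile(r"^[A-Z]{6}\+")


def referenced_fonts(pdf_bytes: bytes) -> list[str]:
    """Return the sorted, de-duplicated font names found in raw PDF bytes.

    Hypothetical helper for illustration only: it sees just the
    uncompressed parts of the file.
    """
    names = set()
    for match in BASEFONT.finditer(pdf_bytes):
        name = match.group(1).decode("latin-1")
        names.add(SUBSET_PREFIX.sub("", name))  # drop any subset tag
    return sorted(names)


# Made-up fragment standing in for a real file's font dictionaries.
SAMPLE = (
    b"<< /Type /Font /Subtype /TrueType /BaseFont /ABCDEF+MyriadPro-Regular >>\n"
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>\n"
)

print(referenced_fonts(SAMPLE))
```

To try it on a real file, read the PDF in binary mode and pass the bytes in. Any name in the list that you don’t have installed is a likely candidate for the warning dialogue.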

 

[Image: Illustrator PDF dialogue box]

Despite any conversions that need to take place, in most cases Illustrator does an excellent job of accessing PDF content. The way Illustrator files are structured, and how vector and raster content are handled, is closely aligned to PostScript and PDF, so we’re able to access the file as raw content.
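To give a feel for what ‘raw content’ means here: a PDF page is described by a content stream made up of simple drawing operators, similar in spirit to PostScript. As a rough sketch (this isn’t something Illustrator exposes, and the function name and sample stream are invented for illustration), the Python below counts the path-building, path-painting and XObject operators in an already-decoded content stream. The tokenizer is deliberately naive, so treat the counts as a hint rather than a proper parse.

```python
import re

# Operator names as defined for PDF content streams.
PATH_OPS = {"m", "l", "c", "v", "y", "h", "re"}               # build a path
PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*"}  # stroke/fill it
XOBJECT_OPS = {"Do"}  # paint an XObject: an image OR a reusable vector form

# Text strings such as "(Hello)" can contain anything, so blank them out
# first. Simplification: nested parentheses are not handled.
STRING_LITERAL = re.compile(r"\((?:\\.|[^\\()])*\)")


def summarise_content_stream(stream: str) -> dict:
    """Count vector and XObject operators in a decoded content stream."""
    counts = {"path": 0, "paint": 0, "xobject": 0}
    cleaned = STRING_LITERAL.sub(" ", stream)
    for token in cleaned.split():
        if token in PATH_OPS:
            counts["path"] += 1
        elif token in PAINT_OPS:
            counts["paint"] += 1
        elif token in XOBJECT_OPS:
            counts["xobject"] += 1
    return counts


# Made-up stream: a red filled triangle, a line of text, and one XObject.
SAMPLE_STREAM = """
q
1 0 0 rg
10 10 m
100 10 l
100 100 l
h
f
BT /F1 12 Tf (l m f) Tj ET
q 200 0 0 150 50 300 cm /Im0 Do Q
Q
"""

print(summarise_content_stream(SAMPLE_STREAM))
```

A stream dominated by path and paint operators is vector artwork of the kind Illustrator can pick apart. A page that is little more than a single Do is more likely a placed image, though Do can also draw a vector form, so it isn’t conclusive on its own.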

In the Layers panel (below) we can see all the document content as separate sub-layers. Whilst some of it may be grouped within Clipping Paths and Clip Groups, every element can be accessed.

 

[Image: Illustrator PDF layers]

From here it’s pretty straightforward to access the logo. Since the whole page is contained within a primary Clip Group, the main Selection tool will just select everything. However, if we click on the body of the logo with the Direct Selection tool, we can see which sub clip group is targeted in the Layers panel, making it easier to zero in on our logo. Notice that not all of the elements that make up the logo are currently selected: although the logo may have been grouped in the original file, the Compound Path is missing the two ellipses that form the centres of the O’s. Once these are selected we can easily copy the logo and paste it into a new document, or move it to a new layer and remove the unwanted content. Our logo is then ready to be saved out as a separate file and used elsewhere.

 

[Image: Illustrator PDF clipping group in layers]

Now let’s look at accessing the logo from within Acrobat. We can call up the ‘Edit PDF’ tools either by clicking on the sidebar or going to Edit>Edit Text & Images. With the ‘Edit’ tool active we can select images, text frames and vector content and edit them within the PDF, or choose to open and edit the content elsewhere.

If we click to select the logo, the background image is initially selected despite being behind our logo. If we click again the main body of the logo becomes active, and we can now shift-click to include the two ellipses from the O characters. Alternatively, if we hold down shift to begin with we can click directly into the logo body without selecting the background image first (PDF editing in all its weird and wonderful glory).

Now, with the logo selected, we can choose “Edit Using...” (also available via right-click) and choose Adobe Illustrator.

 

[Image: Editing a PDF in Acrobat]

You’ll probably get a warning dialogue about object appearances changing after editing, but you only live once so you go ahead and click “Yes”.

The file that opens is a temporary document extracted from, and linked to, our PDF – notice the odd “Acr4000...” filename.

 

[Image: Breaking down the PDF in Illustrator]

At this point, if I simply want to extract the logo I can save this file out under a new filename and I’m good to go. However, if I make an edit to the content and save the temporary file, the master PDF file will update to reflect those changes, which is useful for making minor edits when the original source file isn’t available.

Edit the logo colour and save the file as below.

 

[Image: Working with a PDF in Illustrator]

When we drop back into Acrobat our PDF should update with the new colour.

 

[Image: PDF edited in Illustrator, back in Acrobat]

The process is much the same with raster content. Select the image in Acrobat and choose Edit Using... Photoshop to open a temporary file, then edit, save and update. This is useful for minor retouches or colour amends, and is an alternative method for extracting images in addition to the process outlined in Part 1.

So, whether you need to extract or edit PDF content, the combination of Acrobat, Photoshop and Illustrator gives you all the tools you’ll need.