
make-pdf-javascript.py allows one to create a simple PDF document with embedded JavaScript that will execute upon opening of the PDF document.

The type option allows you to select all objects of a given type. The type is a Name and as such is case-sensitive and must start with a slash character (/).
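To illustrate why the type has to be written exactly as a Name, here is a minimal sketch of type selection. It is not pdf-parser's own code: the sample bytes, the regular expressions and the function name `objects_of_type` are assumptions made up for this illustration, and real PDF files need a far more tolerant parser.

```python
import re

# Hypothetical sample: three indirect objects as they might appear in a PDF body.
SAMPLE = b"""1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [3 0 R] /Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R >>
endobj
"""

# "<id> <version> obj ... endobj" delimits one indirect object.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
# The value of /Type is itself a Name: a slash followed by regular characters.
TYPE_RE = re.compile(rb"/Type\s*(/[^\s/<>\[\]()]+)")


def objects_of_type(data, type_name):
    """Return the IDs of indirect objects whose /Type equals type_name exactly."""
    if not type_name.startswith("/"):
        raise ValueError("a PDF Name must start with a slash, e.g. /Page")
    wanted = type_name.encode("latin-1")
    ids = []
    for match in OBJ_RE.finditer(data):
        found = TYPE_RE.search(match.group(3))
        # Byte-for-byte comparison, so the match is case-sensitive.
        if found and found.group(1) == wanted:
            ids.append(int(match.group(1)))
    return ids


print(objects_of_type(SAMPLE, "/Page"))
print(objects_of_type(SAMPLE, "/page"))
```

In this sketch `/Page` selects only object 3, while `/page` selects nothing and `/Pages` selects object 2, because Names are compared exactly.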

The reference option allows you to select all objects referencing the specified indirect object.

The objects option outputs the data of the indirect object whose ID was specified. If more than one object has the same ID (disregarding the version), all these objects will be output.

The raw option makes pdf-parser output raw data (e.g. not the printable Python representation).

The filter option applies the filter(s) to the stream. For the moment, only FlateDecode is supported (e.g. zlib decompression).

The search option searches for a string in indirect objects (not inside the stream of indirect objects). The search is not case-sensitive, and is susceptible to the obfuscation techniques I documented (as I've yet to encounter these obfuscation techniques in the wild, I decided not to resort to canonicalization).

The stats option displays statistics of the objects found in the PDF document. Use this to identify PDF documents with unusual/unexpected objects, or to classify PDF documents. For example, I generated statistics for 2 malicious PDF files, and although they were very different in content and size, the statistics were identical, proving that they used the same attack vector and shared the same origin.

The code of the parser is quick-and-dirty; I'm not recommending it as a textbook case for PDF parsers, but it gets the job done. You can see the parser in action in this screencast.
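What the filter option does for a FlateDecode stream can be sketched with Python's standard zlib module. This is only an illustration, not the parser's actual code: the helper names `build_sample_object` and `decode_flate_stream` are hypothetical, the sample object is generated on the spot, and the sketch assumes /Length is a direct integer in the dictionary (in real files it can be an indirect reference).

```python
import re
import zlib

LENGTH_RE = re.compile(rb"/Length\s+(\d+)")
STREAM_KEYWORD_RE = re.compile(rb"stream\r?\n")


def build_sample_object(text):
    """Build a hypothetical indirect object with a FlateDecode-compressed stream."""
    compressed = zlib.compress(text)
    header = b"5 0 obj\n<< /Length %d /Filter /FlateDecode >>\n" % len(compressed)
    return header + b"stream\n" + compressed + b"\nendstream\nendobj\n"


def decode_flate_stream(obj_bytes):
    """Return the decompressed stream data of a FlateDecode object."""
    if b"/FlateDecode" not in obj_bytes:
        raise ValueError("only FlateDecode is handled in this sketch")
    length_match = LENGTH_RE.search(obj_bytes)
    keyword_match = STREAM_KEYWORD_RE.search(obj_bytes)
    if length_match is None or keyword_match is None:
        raise ValueError("no stream with a direct /Length found")
    # The stream data starts right after the end-of-line following "stream".
    start = keyword_match.end()
    raw = obj_bytes[start:start + int(length_match.group(1))]
    # FlateDecode is zlib/deflate compression.
    return zlib.decompress(raw)


sample = build_sample_object(b"BT /F1 12 Tf (Hello) Tj ET")
print(decode_flate_stream(sample))
```

Slicing by /Length rather than searching for "endstream" avoids cutting the data short when the compressed bytes happen to contain an end-of-line character.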


This tool will parse a PDF document to identify the fundamental elements used in the analyzed file.
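As a rough idea of what identifying the fundamental elements can look like, the following sketch counts a few structural keywords in a byte string. It is an assumption-laden toy, not how pdf-parser works internally: the sample is a hypothetical fragment (its xref offset and startxref value are placeholders, not computed), and `count_elements` is a name invented here.

```python
import re
from collections import Counter

# Hypothetical fragment; the offsets below are placeholders, not real values.
SAMPLE = b"""%PDF-1.1
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [] /Count 0 >>
endobj
xref
0 1
0000000000 65535 f
trailer
<< /Size 3 /Root 1 0 R >>
startxref
0
%%EOF
"""

# One line-anchored pattern per structural element.
ELEMENT_PATTERNS = {
    "comment": rb"^%.*$",
    "indirect object": rb"^\d+\s+\d+\s+obj\b",
    "xref": rb"^xref\b",
    "trailer": rb"^trailer\b",
    "startxref": rb"^startxref\b",
}


def count_elements(data):
    """Count how often each structural element starts a line in data."""
    counts = Counter()
    for name, pattern in ELEMENT_PATTERNS.items():
        counts[name] = len(re.findall(pattern, data, re.MULTILINE))
    return counts


for name, count in count_elements(SAMPLE).items():
    print(name, count)
```

Two files that yield the same counts are not necessarily related, but comparing such counts is a cheap first step before looking at the objects themselves.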
Here is a set of free YouTube videos showing how to use my tools: Malicious PDF Analysis Workshop.
