View Issue Details

ID: 0008879
Project: Kali Linux
Category: New Tool Requests
View Status: public
Last Update: 2024-10-26 21:26
Reporter: mccrypter
Assigned To: (unassigned)
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: new
Resolution: open
Summary: 0008879: pdfalyzer - A PDF analysis tool for visualizing PDF's inner structure + scanning embedded binary streams for malicious content
Description

[Name] pdfalyzer
[Version] 1.14.10
[Homepage] https://github.com/michelcrypt4d4mus/pdfalyzer
[Download] https://pypi.org/project/pdfalyzer/#files
[Author] Michel de Cryptadamus
[Licence] GPLv3

[Description] A PDF analysis tool for visualizing the inner tree-like data structure of a PDF in spectacularly large and colorful diagrams, as well as scanning the binary streams embedded in the PDF for hidden, potentially malicious content.

Many screenshots are visible here: https://github.com/michelcrypt4d4mus/pdfalyzer?tab=readme-ov-file#example-output

[Dependencies] python 3.9+
python = "^3.9"
anytree = "~=2.8"
chardet = ">=5.0.0,<6.0.0"
PyPDF2 = "^2.10"
python-dotenv = "^0.21.0"
rich = "^12.5.1"
rich-argparse-plus = "^0.3.1"
yaralyzer = "^0.9.4" (already packaged for Kali)
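
For anyone checking a packaging environment against this list, here is a minimal sketch (not part of pdfalyzer) that reports which of the distributions named above are installed. It uses only the standard library's importlib.metadata. The helper name installed_versions is hypothetical, and the sketch does not evaluate the version constraints themselves.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the dependency list in this request.
DEPENDENCIES = [
    "anytree",
    "chardet",
    "PyPDF2",
    "python-dotenv",
    "rich",
    "rich-argparse-plus",
    "yaralyzer",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Hypothetical convention for this sketch: None means "not installed".
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(DEPENDENCIES).items():
        print(f"{name}: {ver or 'not installed'}")
```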

[Similar tools] Didier Stevens' PDF tools do a couple of things with malicious PDFs, but no existing tool offers the visualizations or the YARA scanning of embedded binaries that pdfalyzer does.

[Activity] Development started in mid-summer 2022. Actively maintained, with a surprising amount of interest and usage since it was open sourced.
Already available as a Homebrew package.

[How to install] pipx install pdfalyzer is the easiest way

[How to use] - What are some basic commands/functions to demonstrate it?
See https://github.com/michelcrypt4d4mus/pdfalyzer?tab=readme-ov-file#example-output for example output.
See here for use of the pdfalyzer in malware analysis by an unrelated party.

# Basic scan
pdfalyze lacan_buys_the_dip.pdf

# Dump output to .svg image
pdfalyze -svg -t lacan_buys_the_dip.pdf 
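
The same two invocations can also be driven from a script. This is a minimal sketch, assuming the pdfalyze executable is on PATH after installation. build_pdfalyze_command and run_pdfalyze are hypothetical helpers, not part of pdfalyzer, and only the -svg and -t flags shown above are used.

```python
import shutil
import subprocess


def build_pdfalyze_command(pdf_path, export_svg=False):
    """Build the argument list for the two invocations shown in this request."""
    command = ["pdfalyze"]
    if export_svg:
        # Flags taken verbatim from the "Dump output to .svg image" example.
        command += ["-svg", "-t"]
    command.append(pdf_path)
    return command


def run_pdfalyze(pdf_path, export_svg=False):
    """Run pdfalyze and return its exit code; raise if it is not installed."""
    if shutil.which("pdfalyze") is None:
        raise FileNotFoundError(
            "pdfalyze not found on PATH (try: pipx install pdfalyzer)"
        )
    return subprocess.run(build_pdfalyze_command(pdf_path, export_svg)).returncode
```

Calling run_pdfalyze("lacan_buys_the_dip.pdf", export_svg=True) would then correspond to the second command above.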

[Packaged] - Is the tool already packaged for Debian?
no

Activities

Arszilla

2024-09-28 15:45

reporter   ~0019844

Last edited: 2024-09-28 15:46

I have been trying to package this. However, PyPDF2 is actually pypdf (https://pypi.org/project/PyPDF2/ points to https://github.com/py-pdf/pypdf for its source code, and pypdf is in Debian's repositories as python3-pypdf). It would be appreciated if you could update your dependencies to reflect this change, so that I can draft the package and share it.

mccrypter

2024-10-26 21:26

reporter   ~0019978

Finally got around to fixing this; the pypdf version jump was slightly non-trivial. Anyway, just released pdfalyzer version 1.16.1.
CHANGELOG: https://github.com/michelcrypt4d4mus/pdfalyzer/blob/master/CHANGELOG.md#1160

Issue History

Date Modified Username Field Change
2024-08-19 19:50 mccrypter New Issue
2024-09-28 15:45 Arszilla Note Added: 0019844
2024-09-28 15:46 Arszilla Note Edited: 0019844
2024-10-22 20:17 daniruiz Summary [pdfalyzer] A PDF analysis tool for visualizing PDF's inner structure + scanning embedded binary streams for malicious content => pdfalyzer - A PDF analysis tool for visualizing PDF's inner structure + scanning embedded binary streams for malicious content
2024-10-26 21:26 mccrypter Note Added: 0019978