Awesome Skills
ExploreCategoriesTrending
Awesome Skills

Discover, compare, and use the best Agent Skills for Claude and AI assistants.

Product

  • Explore Skills
  • Categories
  • Trending
  • Collections

Resources

  • Skills Docs
  • Skill Document
  • Official Skills
  • Codex Skills Docs
  • Agent Skills Spec

© 2026 Awesome Skills. All rights reserved.

Document ProcessingVerifiedFeatured

pdf

Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.

0.0(0 reviews)
34.5k stars
3.1k forks
24.3k views
claudepdf
View on GitHub
Installation
Quick Install

Run this command in Claude Code to install the skill

>/install https://github.com/anthropics/skills/tree/main/skills/pdf
Manual Installation

Install to your personal skills directory (~/.claude/skills/pdf/)

# Create skill directory
mkdir -p ~/.claude/skills/pdf

# Download SKILL.md from GitHub
curl -sL "https://raw.githubusercontent.com/anthropics/skills/main/skills/pdf/SKILL.md" \
  -o ~/.claude/skills/pdf/SKILL.md

Target: ~/.claude/skills/pdf/

Information
Authoranthropics
UpdatedJan 12, 2026
Category
F

Document Processing

Skills for working with documents, PDFs, spreadsheets, and presentations

Related Skills

Document Processing

pptx

Presentation creation, editing, and analysis. When Claude needs to work with presentations (.pptx files) for: (1) Creating new presentations, (2) Modifying or editing content, (3) Working with layouts, (4) Adding comments or speaker notes, or any other presentation tasks

claudeanalysis
34.5k
3.1k
15.2k
Document Processing

docx

Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. When Claude needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks

claudeanalysis
34.5k
3.1k
18.5k
Document Processing

xlsx

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas

claudecsvanalysis+1
34.5k
3.1k
21.4k
Document Processing

Markdown to EPUB Converter

Converts markdown documents and chat summaries into professional EPUB ebook files

markdown
44
3
3.2k
Skill Instructions
View Raw

This is the SKILL.md content that Claude uses to understand this skill.

PDF Processing Guide

Overview

This guide covers essential PDF processing operations using Python libraries and command-line tools. For advanced features, JavaScript libraries, and detailed examples, see reference.md. If you need to fill out a PDF form, read forms.md and follow its instructions.

Quick Start

python
from pypdf import PdfReader, PdfWriter # Read a PDF reader = PdfReader("document.pdf") print(f"Pages: {len(reader.pages)}") # Extract text text = "" for page in reader.pages: text += page.extract_text()

Python Libraries

pypdf - Basic Operations

Merge PDFs

python
from pypdf import PdfWriter, PdfReader writer = PdfWriter() for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]: reader = PdfReader(pdf_file) for page in reader.pages: writer.add_page(page) with open("merged.pdf", "wb") as output: writer.write(output)

Split PDF

python
reader = PdfReader("input.pdf") for i, page in enumerate(reader.pages): writer = PdfWriter() writer.add_page(page) with open(f"page_{i+1}.pdf", "wb") as output: writer.write(output)

Extract Metadata

python
reader = PdfReader("document.pdf") meta = reader.metadata print(f"Title: {meta.title}") print(f"Author: {meta.author}") print(f"Subject: {meta.subject}") print(f"Creator: {meta.creator}")

Rotate Pages

python
reader = PdfReader("input.pdf") writer = PdfWriter() page = reader.pages[0] page.rotate(90) # Rotate 90 degrees clockwise writer.add_page(page) with open("rotated.pdf", "wb") as output: writer.write(output)

pdfplumber - Text and Table Extraction

Extract Text with Layout

python
import pdfplumber with pdfplumber.open("document.pdf") as pdf: for page in pdf.pages: text = page.extract_text() print(text)

Extract Tables

python
with pdfplumber.open("document.pdf") as pdf: for i, page in enumerate(pdf.pages): tables = page.extract_tables() for j, table in enumerate(tables): print(f"Table {j+1} on page {i+1}:") for row in table: print(row)

Advanced Table Extraction

python
import pandas as pd with pdfplumber.open("document.pdf") as pdf: all_tables = [] for page in pdf.pages: tables = page.extract_tables() for table in tables: if table: # Check if table is not empty df = pd.DataFrame(table[1:], columns=table[0]) all_tables.append(df) # Combine all tables if all_tables: combined_df = pd.concat(all_tables, ignore_index=True) combined_df.to_excel("extracted_tables.xlsx", index=False)

reportlab - Create PDFs

Basic PDF Creation

python
from reportlab.lib.pagesizes import letter from reportlab.pdfgen import canvas c = canvas.Canvas("hello.pdf", pagesize=letter) width, height = letter # Add text c.drawString(100, height - 100, "Hello World!") c.drawString(100, height - 120, "This is a PDF created with reportlab") # Add a line c.line(100, height - 140, 400, height - 140) # Save c.save()

Create PDF with Multiple Pages

python
from reportlab.lib.pagesizes import letter from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak from reportlab.lib.styles import getSampleStyleSheet doc = SimpleDocTemplate("report.pdf", pagesize=letter) styles = getSampleStyleSheet() story = [] # Add content title = Paragraph("Report Title", styles['Title']) story.append(title) story.append(Spacer(1, 12)) body = Paragraph("This is the body of the report. " * 20, styles['Normal']) story.append(body) story.append(PageBreak()) # Page 2 story.append(Paragraph("Page 2", styles['Heading1'])) story.append(Paragraph("Content for page 2", styles['Normal'])) # Build PDF doc.build(story)

Command-Line Tools

pdftotext (poppler-utils)

bash
# Extract text pdftotext input.pdf output.txt # Extract text preserving layout pdftotext -layout input.pdf output.txt # Extract specific pages pdftotext -f 1 -l 5 input.pdf output.txt # Pages 1-5

qpdf

bash
# Merge PDFs qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf # Split pages qpdf input.pdf --pages . 1-5 -- pages1-5.pdf qpdf input.pdf --pages . 6-10 -- pages6-10.pdf # Rotate pages qpdf input.pdf output.pdf --rotate=+90:1 # Rotate page 1 by 90 degrees # Remove password qpdf --password=mypassword --decrypt encrypted.pdf decrypted.pdf

pdftk (if available)

bash
# Merge pdftk file1.pdf file2.pdf cat output merged.pdf # Split pdftk input.pdf burst # Rotate pdftk input.pdf rotate 1east output rotated.pdf

Common Tasks

Extract Text from Scanned PDFs

python
# Requires: pip install pytesseract pdf2image import pytesseract from pdf2image import convert_from_path # Convert PDF to images images = convert_from_path('scanned.pdf') # OCR each page text = "" for i, image in enumerate(images): text += f"Page {i+1}:\n" text += pytesseract.image_to_string(image) text += "\n\n" print(text)

Add Watermark

python
from pypdf import PdfReader, PdfWriter # Create watermark (or load existing) watermark = PdfReader("watermark.pdf").pages[0] # Apply to all pages reader = PdfReader("document.pdf") writer = PdfWriter() for page in reader.pages: page.merge_page(watermark) writer.add_page(page) with open("watermarked.pdf", "wb") as output: writer.write(output)

Extract Images

bash
# Using pdfimages (poppler-utils) pdfimages -j input.pdf output_prefix # This extracts all images as output_prefix-000.jpg, output_prefix-001.jpg, etc.

Password Protection

python
from pypdf import PdfReader, PdfWriter reader = PdfReader("input.pdf") writer = PdfWriter() for page in reader.pages: writer.add_page(page) # Add password writer.encrypt("userpassword", "ownerpassword") with open("encrypted.pdf", "wb") as output: writer.write(output)

Quick Reference

TaskBest ToolCommand/Code
Merge PDFspypdf
writer.add_page(page)
Split PDFspypdfOne page per file
Extract textpdfplumber
page.extract_text()
Extract tablespdfplumber
page.extract_tables()
Create PDFsreportlabCanvas or Platypus
Command line mergeqpdf
qpdf --empty --pages ...
OCR scanned PDFspytesseractConvert to image first
Fill PDF formspdf-lib or pypdf (see forms.md)See forms.md

Next Steps

  • For advanced pypdfium2 usage, see reference.md
  • For JavaScript libraries (pdf-lib), see reference.md
  • If you need to fill out a PDF form, follow the instructions in forms.md
  • For troubleshooting guides, see reference.md