Clean Static Reports Script
This script removes unnecessary dynamic functionality from HTML report files while preserving all static page functionality.
What it removes (SAFE for static sites):
- ✅ AdGuard browser extension scripts (injected by browser)
- ✅ CodeMirror editor functionality (only needed for live editing)
- ✅ Server API interaction scripts (editing, polling, cache management, report generation)
- ✅ Dynamic content modification capabilities
- ✅ Export/share functionality that requires server interaction
- ✅ Theme toggle functionality (locks to single theme)
- ✅ Social media sharing buttons
- ✅ Histogram selection checkboxes for export
- ✅ Debug logging utilities
- ✅ Regenerate/recalculate buttons that require server interaction
- ✅ Edit buttons for titles, reports, summaries
- ✅ Empty comment blocks and unused CSS
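As a rough illustration of the first two items, injected or editor-only resources could be recognized from their URL alone. This is a minimal sketch, not the script's actual code: `classify_resource` is a hypothetical helper, and the two substrings it matches are taken from the URLs shown in the example output in this README.

```python
from typing import Optional


def classify_resource(url: str) -> Optional[str]:
    """Return a label for a removable external resource, or None to keep it.

    Hypothetical helper: the matched substrings come from the URLs in the
    README's example output, not from the script's real pattern list.
    """
    lowered = url.lower()
    if "local.adguard.org" in lowered:
        return "AdGuard"
    if "codemirror" in lowered:
        return "CodeMirror"
    # Unknown resources are kept, matching the conservative approach.
    return None
```

In the real script, the URL would presumably come from the `src` of a script tag or the `href` of a stylesheet link parsed with BeautifulSoup.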
What it preserves (ESSENTIAL for static functionality):
- ✅ All static content and styling
- ✅ Show more/less functionality for reports, threads, and histograms
- ✅ Histogram display and behavior
- ✅ Tooltip functionality for topic descriptions
- ✅ Theme styling (CSS)
- ✅ Time conversion utilities
- ✅ Responsive design and layout
Usage
Basic usage (cleans files in-place):
python clean_static_reports.py instructor_reports/*.html
With backup (recommended):
python clean_static_reports.py --backup instructor_reports/*.html
Output to different directory:
python clean_static_reports.py --output-dir cleaned_reports instructor_reports/*.html
Dry run (see what would be done without making changes):
python clean_static_reports.py --dry-run instructor_reports/*.html
Clean all HTML files in multiple directories:
python clean_static_reports.py --backup instructor_reports/*.html html/*.html
Command-line options
--backup or -b: Create .bak backup files before cleaning
--output-dir DIR or -o DIR: Output cleaned files to specified directory
--dry-run or -n: Show what would be done without making changes
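The options above map naturally onto Python's standard `argparse` module. The following is a sketch under that assumption; `build_parser` is a hypothetical name, the help strings paraphrase this README, and the real script may define its parser differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="clean_static_reports.py",
        description="Remove server-dependent functionality from HTML reports.",
    )
    # One or more HTML files; the shell expands globs such as *.html.
    parser.add_argument("files", nargs="+", help="HTML files to clean")
    parser.add_argument("-b", "--backup", action="store_true",
                        help="create .bak backup files before cleaning")
    parser.add_argument("-o", "--output-dir", metavar="DIR", default=None,
                        help="write cleaned files to DIR instead of in-place")
    parser.add_argument("-n", "--dry-run", action="store_true",
                        help="show what would be done without making changes")
    return parser
```

With this parser, `build_parser().parse_args(["-b", "report.html"])` yields `backup=True`, `dry_run=False`, `output_dir=None`, and `files=["report.html"]`.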
Example output
Processing: instructor_reports/gender_tolerance.html
Created backup: instructor_reports/gender_tolerance.html.bak
Removing AdGuard script: //local.adguard.org?ts=1750306983708&type=content-script...
Removing AdGuard script: //local.adguard.org?ts=1750306983708&name=AdGuard%20Extra...
Removing CodeMirror script: https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.65.5/codemirror.min.js
Removing CodeMirror CSS: https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.65.5/codemirror.min.css
Removing histogram-title-fixer script
Removing API interaction script (2046 chars)
Removing API interaction script (21295 chars)
✓ Cleaned successfully
Size reduction: 130,505 bytes (19.8%)
Output: instructor_reports/gender_tolerance.html
Completed: 1 successful, 0 failed
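The "Size reduction" line can be reproduced from the file sizes before and after cleaning. The sketch below assumes the percentage is measured against the original file size; `format_size_reduction` is a hypothetical helper, not a function from the script.

```python
def format_size_reduction(original_bytes: int, cleaned_bytes: int) -> str:
    """Format bytes saved and percentage saved, relative to the original size.

    Assumption: the percentage is computed against the original file size.
    """
    saved = original_bytes - cleaned_bytes
    # Guard against empty input files to avoid dividing by zero.
    percent = (saved / original_bytes * 100) if original_bytes else 0.0
    return f"Size reduction: {saved:,} bytes ({percent:.1f}%)"
```

For instance, a made-up 1,000-byte file cleaned down to 802 bytes gives `Size reduction: 198 bytes (19.8%)`.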
Safety features
The script is deliberately conservative and includes several safety checks:
- Pattern matching: Only removes scripts with specific API endpoint patterns
- Essential functionality check: Preserves scripts that contain critical functionality keywords
- Backup option: With --backup, a .bak copy of each file is saved before it is modified (recommended for in-place cleaning)
- Dry run mode: Test what would be changed without making modifications
- Conservative approach: When in doubt, the script keeps the code
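The first two checks and the keep-when-in-doubt rule can be combined into a single decision for each inline script. This is only a sketch of that idea: `should_remove`, `API_PATTERNS`, and `ESSENTIAL_KEYWORDS` are hypothetical names, and the patterns and keywords are placeholders, since this README does not list the ones the script actually uses.

```python
import re

# Placeholder patterns for server API calls (assumed, not the script's own).
API_PATTERNS = [
    re.compile(r"fetch\(\s*['\"]/api/"),
    re.compile(r"XMLHttpRequest"),
]

# Placeholder keywords marking functionality that must survive (assumed).
ESSENTIAL_KEYWORDS = ("showmore", "tooltip")


def should_remove(script_text: str) -> bool:
    """Return True only when a script is clearly safe to remove."""
    lowered = script_text.lower()
    # Essential functionality check: any critical keyword means keep.
    if any(keyword in lowered for keyword in ESSENTIAL_KEYWORDS):
        return False
    # Pattern matching: remove only on an explicit API pattern hit.
    # Anything that matches nothing is kept (when in doubt, keep the code).
    return any(pattern.search(script_text) for pattern in API_PATTERNS)
```

Checking essential keywords before API patterns means a script that mixes both kinds of code is kept, which trades a slightly larger file for a lower risk of breaking the page.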
Requirements
- Python 3.6+
- BeautifulSoup4:
pip install beautifulsoup4
Typical size reduction
Expect a file size reduction of roughly 18-22%, with all of the static functionality listed above preserved.
Actual results from processing 16 instructor report files (15 are listed here):
- abortion_debate.html: 21.0% reduction (49,898 bytes)
- affirm_boys_men.html: 18.8% reduction (148,193 bytes)
- alcoholics_liver.html: 19.6% reduction (55,949 bytes)
- animal_suffering.html: 20.4% reduction (55,881 bytes)
- deaths_despair.html: 20.7% reduction (52,222 bytes)
- enhancement_ethics.html: 21.2% reduction (46,115 bytes)
- euthanasia_mental.html: 20.4% reduction (54,876 bytes)
- gender_dialogues.html: 19.2% reduction (80,944 bytes)
- group_work.html: 19.0% reduction (79,122 bytes)
- invol_commit_housing.html: 20.2% reduction (56,502 bytes)
- med_paternalism.html: 20.1% reduction (58,015 bytes)
- moral_agency.html: 19.3% reduction (86,954 bytes)
- organ_markets.html: 20.3% reduction (54,732 bytes)
- species_ethics.html: 17.9% reduction (80,932 bytes)
- univ_healthcare.html: 20.1% reduction (51,212 bytes)
Average reduction: 19.8%, with all static functionality preserved.