SwayReports

This repository contains a collection of Sway instructor reports and a showcase website to display them. The reports are HTML files with data visualizations and instructor summaries of class discussions.

Repository Structure

Scripts

The easiest way to manage your reports is with the interactive shell script:

./update_reports.sh

This will show a menu with options to:

  1. Update showcase (preserves all your manual work)
  2. Update showcase AND refresh existing reports
  3. Recover from backup
  4. Exit

update_showcase.py

This script updates the showcase webpage with the latest reports from the instructor_reports directory.

# Regular update (preserves ALL your manual work)
python update_showcase.py

# Refresh titles, descriptions, and categories from HTML files
python update_showcase.py --refresh-existing

By default, this script preserves all your manual work.

See Update Showcase Guide for detailed documentation.

preprocess_reports.py

This script preprocesses the HTML files by removing the Sway header and social sharing sections. It creates backups of the original files in the instructor_reports_backup directory.

python preprocess_reports.py
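The backup step described above can be sketched as follows. This is a minimal illustration, not the code in preprocess_reports.py: the function name backup_reports and the rule of never overwriting an earlier backup are assumptions, and only the two directory names come from this README.

```python
import shutil
from pathlib import Path


def backup_reports(reports_dir, backup_dir):
    """Copy each .html report into backup_dir unless a backup already exists.

    Hypothetical helper: returns the list of backup files created on this run.
    """
    reports_dir, backup_dir = Path(reports_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for report in sorted(reports_dir.glob("*.html")):
        target = backup_dir / report.name
        # Assumption: an existing backup holds the untouched original,
        # so it is never overwritten by a later run.
        if not target.exists():
            shutil.copy2(report, target)
            copied.append(target)
    return copied


if __name__ == "__main__":
    created = backup_reports("instructor_reports", "instructor_reports_backup")
    print(f"Backed up {len(created)} report(s)")
```

Skipping files that already have a backup keeps the first, unprocessed copy safe even if the preprocessing script is run more than once.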

recover_showcase_data.py

If you ever lose your showcase data or need to restore from a backup:

python recover_showcase_data.py

This will show all available backups and let you choose which one to recover from.

Requirements

The scripts require Python 3.6+ and the beautifulsoup4 package.

Install the required package with:

pip install beautifulsoup4

Usage

  1. Place the instructor report HTML files in the instructor_reports directory
  2. Run the main script:
    ./update_reports.sh
    
  3. Open instructor_reports_showcase.html in a web browser to view the showcase

Modifying the Showcase

The showcase webpage (instructor_reports_showcase.html) includes:

If you need to modify the showcase design, you can edit the CSS styles in the instructor_reports_showcase.html file.

Documentation

All documentation is in the docs/ folder:

Document                      Description
SEO & Standalone Navigation   How the Google Sites sidebar/banner replica works for direct-access pages (architecture, layout CSS, scripts, bugs)
Branding Guide                Logo, colors, typography, tone of voice
Architecture                  HTML generation, BeautifulSoup entity handling
Update Showcase Guide         Using update_showcase.py and its options
Category Manager              Flask web app for managing report categories
Regenerate Reports            Dashboard URLs for regenerating/exporting the 32 instructor reports
Instructor Report URLs        Directory of 72 report URLs extracted from emails
Thread Readability Report     QA audit of thread showcase and student threads
Comprehensive Review          Overview of the Sway platform, navigation, and key pages
Clean Static Reports          Script docs for stripping dynamic functionality from HTML reports
Histogram Sorting             Completion report for alphabetical histogram sorting across 32 reports

Research reports live with their code in convergence_figures/.

Notes

Sway Reports PDF Converter

A tool to convert HTML instructor reports to paginated PDFs while preserving exact appearance.

Features

Instructor Reports Showcase

The instructor_reports_showcase.html page provides a polished, searchable, and filterable web showcase of instructor reports generated by Sway. It is modeled after the student feedback showcase and is suitable for publishing on GitHub Pages or similar static hosting. Each card displays the assignment title, a short description, and a link to the full report. The site supports dark/light mode, responsive design, and category filtering, so browsing stays easy even with dozens of reports.

To use:

Installation

# Clone the repository
git clone <your-repository-url>
cd SwayReports

# Install dependencies
npm install

Usage

Basic Usage

# Convert a single HTML file
node htmlToPdf.js --input html/Report.html

# Convert all HTML files in a directory
node htmlToPdf.js --input html/

Available Options

Options:
  --input, -i     Input HTML file or directory          [string] [required]
  --output, -o    Output PDF directory                  [string] [default: "pdfs"]
  --format, -f    Page format                           [string] [default: "A4"]
  --margin, -m    Margins in CSS format (e.g. "1cm")    [string] [default: "1cm"]
  --scale, -s     Scale factor for the page             [number] [default: 1.0]
  --header, -h    Include header                        [boolean] [default: true]
  --footer        Include footer with page numbers      [boolean] [default: true]
  --help          Show help                             [boolean]

Examples

# Convert a single file with custom options
node htmlToPdf.js --input html/Report.html --output my-pdfs --format Letter --margin 0.5in --scale 0.9

# Convert all HTML files without headers and footers
node htmlToPdf.js --input html/ --header false --footer false

# Use npm script
npm run convert -- --input html/

How It Works

This tool uses Playwright to render HTML pages exactly as they would appear in a browser, then applies intelligent pagination using CSS print styles. The pagination logic:

  1. Prevents page breaks within images, tables, and figures
  2. Avoids orphaned or widowed text (prevents single lines at top/bottom of pages)
  3. Keeps headings with their following content
  4. Preserves all styling, colors, and backgrounds
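The four pagination rules above map onto standard CSS print properties. The sketch below builds such a stylesheet as a string. It is an illustration only: the function name build_print_css and the selectors are assumptions and may not match what htmlToPdf.js actually injects.

```python
def build_print_css(orphans: int = 2, widows: int = 2) -> str:
    """Return a print stylesheet covering the four pagination rules.

    Hypothetical helper; selectors and defaults are illustrative.
    """
    rules = [
        # 1. No page breaks inside images, tables, or figures.
        "img, table, figure { break-inside: avoid; page-break-inside: avoid; }",
        # 2. Require a minimum number of lines at the bottom/top of a page.
        f"p, li {{ orphans: {orphans}; widows: {widows}; }}",
        # 3. Keep headings attached to the content that follows them.
        "h1, h2, h3, h4, h5, h6 { break-after: avoid; page-break-after: avoid; }",
        # 4. Ask the browser to print colors and backgrounds as shown on screen.
        "* { -webkit-print-color-adjust: exact; print-color-adjust: exact; }",
    ]
    body = "\n".join("  " + rule for rule in rules)
    return "@media print {\n" + body + "\n}"


if __name__ == "__main__":
    print(build_print_css())
```

With Playwright, a stylesheet like this could be added to the page through add_style_tag before calling pdf with background printing enabled. Whether this tool does exactly that is an assumption.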

Requirements

Category Manager

The Category Manager is a simple Flask web application for managing categories in the Instructor Reports Showcase. It provides a user-friendly interface to manage categories and assign them to reports.

Usage

  1. Install the required dependencies:
    pip install -r requirements.txt
    
  2. Run the Category Manager:
    python category_manager.py
    
  3. Open your browser and navigate to:
    http://localhost:5000
    
  4. Use the interface to manage categories and assign them to reports.

For more details, see Category Manager Documentation.

HTML Entity Handling

Issue: Double-escaped ampersands

We identified an issue where ampersands (&) in report titles and descriptions were being displayed as &amp; in the browser. This happened because the HTML entities were being double-escaped during the process of updating the showcase file.

For example, a title like “Circumcision, Parental Leave & Other Topics” was displayed as “Circumcision, Parental Leave &amp; Other Topics” in the browser.
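The effect can be reproduced with Python's standard html module alone. This is a minimal sketch of the failure mode, not the code path in update_showcase.py. The helper name escape_for_html is hypothetical, and html.unescape stands in for the single level of entity decoding a browser performs when rendering.

```python
import html


def escape_for_html(text: str) -> str:
    """Escape text once for inclusion in HTML (hypothetical helper)."""
    return html.escape(text, quote=False)


title = "Circumcision, Parental Leave & Other Topics"

once = escape_for_html(title)   # correct: "&" becomes "&amp;" in the HTML source
twice = escape_for_html(once)   # bug: the "&" of "&amp;" is escaped a second time

# A browser decodes exactly one level of entities when it renders the page.
shown_ok = html.unescape(once)
shown_bad = html.unescape(twice)

if __name__ == "__main__":
    print("source, escaped once: ", once)
    print("source, escaped twice:", twice)
    print("rendered (correct):   ", shown_ok)
    print("rendered (bug):       ", shown_bad)
```

After one round of decoding, the twice-escaped string still contains a literal &amp;, which is what readers saw in the showcase.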

Solution

We implemented the following changes to fix this issue:

  1. Modified both category_manager.py and update_showcase.py to handle HTML entities properly using a specialized approach with BeautifulSoup.

  2. Created a cleanup script fix_showcase_ampersands.py to fix any existing double-escaped ampersands in the showcase file.

  3. Added a test file test_html_escaping.py to verify our solution works.
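A cleanup pass of the kind described in step 2 might look like the sketch below. It is an assumption about the approach, not the contents of fix_showcase_ampersands.py, and the function name fix_double_escaped_ampersands is hypothetical. It collapses repeated escaping of the ampersand only and leaves correctly escaped text untouched.

```python
def fix_double_escaped_ampersands(html_text: str) -> str:
    """Collapse '&amp;amp;' (and deeper nestings) down to a single '&amp;'.

    Hypothetical helper: operates on raw HTML source text, not a parsed tree.
    """
    # Loop so that triple or deeper escaping is also reduced to one level.
    while "&amp;amp;" in html_text:
        html_text = html_text.replace("&amp;amp;", "&amp;")
    return html_text


if __name__ == "__main__":
    broken = "<div>Circumcision, Parental Leave &amp;amp; Other Topics</div>"
    print(fix_double_escaped_ampersands(broken))
```

Because correctly escaped input passes through unchanged, a function like this is safe to run repeatedly on the same file.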

Technical Details

The main fix involves changing how we add text content to BeautifulSoup elements:

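from bs4 import BeautifulSoup

# Instead of this (causes escaping issues when the title already contains entities):
title_div.string = report_title

# We now do this: parse the title as HTML so entities such as &amp; are decoded once,
# then append the resulting text node (assumes report_title is non-empty).
title_div.clear()
title_html = BeautifulSoup(f"<span>{report_title}</span>", 'html.parser')
title_div.append(title_html.span.contents[0])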

This approach correctly maintains HTML entity representation without double-escaping.

Scripts