feat: add SHA-256 integrity verification for backup and restore#23
feat: add SHA-256 integrity verification for backup and restore#23
Conversation
Compute a content checksum at backup time and store it in the .info file. On restore (both --undo and --restore), recompute and compare to detect corruption or tampering while in trash. Backwards compatible with old .info files that lack a checksum line.
There was a problem hiding this comment.
Pull request overview
Adds SHA-256 integrity verification to rm-safely by recording a checksum at “backup” (trash/move) time and validating it on restore/undo, with new end-to-end tests covering normal restores, tampering, and backward compatibility with older .info files.
Changes:
- Write
sha256:<digest>into the trash.infometadata file during delete/move-to-trash. - Verify stored checksum during
--undoand--restore, failing restore if content was modified in trash. - Extend E2E coverage with integrity verification scenarios (normal, tampered, and old-format
.info).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
rm-safely |
Adds checksum computation/storage and checksum verification on restore/undo; updates .info parsing to read only line 1 as original path. |
tests/test_e2e.sh |
Adds an “Integrity Verification” scenario validating checksum presence, successful restore, tamper detection, and backward compatibility. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| compute_checksum() { | ||
| local target="$1" | ||
| if [ -f "$target" ] && [ ! -L "$target" ]; then | ||
| sha256sum "$target" | cut -d' ' -f1 | ||
| elif [ -L "$target" ]; then | ||
| readlink "$target" | sha256sum | cut -d' ' -f1 | ||
| elif [ -d "$target" ]; then | ||
| (cd "$target" && find . -type f -print0 2>/dev/null | sort -z | xargs -0 sha256sum 2>/dev/null | sha256sum | cut -d' ' -f1) | ||
| fi |
There was a problem hiding this comment.
compute_checksum relies on sha256sum (and for directories also sort -z/xargs -0). On macOS these commands are often unavailable by default, which will either emit "command not found" during backup or make restores fail when a checksum exists but the platform can’t compute the actual hash. Consider adding a small wrapper that selects an available SHA-256 implementation (e.g., sha256sum vs shasum -a 256) and gracefully skips checksum generation/verification when none is available.
| elif [ -L "$target" ]; then | ||
| readlink "$target" | sha256sum | cut -d' ' -f1 | ||
| elif [ -d "$target" ]; then | ||
| (cd "$target" && find . -type f -print0 2>/dev/null | sort -z | xargs -0 sha256sum 2>/dev/null | sha256sum | cut -d' ' -f1) |
There was a problem hiding this comment.
Directory checksums are computed only over regular files (find . -type f). This means symlinks inside directories (and other non-regular entries) are not covered by the integrity check, so a trashed directory could be modified without detection. If directory integrity is a goal, include symlinks (and possibly empty dirs) in the checksum input, or document the limitation.
| (cd "$target" && find . -type f -print0 2>/dev/null | sort -z | xargs -0 sha256sum 2>/dev/null | sha256sum | cut -d' ' -f1) | |
| ( | |
| cd "$target" || exit 0 | |
| # Build a deterministic representation of the directory tree, | |
| # including regular files, symlinks and directories (even empty). | |
| while IFS= read -r -d '' entry; do | |
| # Strip leading "./" for stability | |
| local rel="${entry#./}" | |
| if [ -f "$entry" ] && [ ! -L "$entry" ]; then | |
| local fh | |
| fh=$(sha256sum "$entry" | cut -d' ' -f1) | |
| printf 'F %s %s\n' "$rel" "$fh" | |
| elif [ -L "$entry" ]; then | |
| local target_path | |
| target_path=$(readlink "$entry") | |
| printf 'L %s %s\n' "$rel" "$target_path" | |
| elif [ -d "$entry" ]; then | |
| printf 'D %s\n' "$rel" | |
| fi | |
| done < <(find . -mindepth 1 -print0 2>/dev/null | sort -z) | |
| ) | sha256sum | cut -d' ' -f1 |
| fi | ||
| # Clean up tampered file from trash and undo stack entry | ||
| /bin/rm -f "$trash_file" "$trash_dir/info/${trash_name}.info" 2>/dev/null | ||
| sed -i.bak '$ d' "$UNDO_STACK" 2>/dev/null; /bin/rm -f "${UNDO_STACK}.bak" |
There was a problem hiding this comment.
This cleanup step can terminate the whole e2e script because the file runs with set -e: if sed -i.bak fails (e.g., UNDO_STACK missing/empty or sed incompatibility), the script will exit. Wrap the sed call with || true or guard it with if [ -f "$UNDO_STACK" ]; then ... fi so cleanup failures don't abort the test run.
| sed -i.bak '$ d' "$UNDO_STACK" 2>/dev/null; /bin/rm -f "${UNDO_STACK}.bak" | |
| if [ -f "$UNDO_STACK" ]; then | |
| sed -i.bak '$ d' "$UNDO_STACK" 2>/dev/null || true | |
| /bin/rm -f "${UNDO_STACK}.bak" | |
| fi |
| fi | ||
| # Clean up | ||
| /bin/rm -f "$trash_file" "$trash_dir/info/${trash_name}.info" 2>/dev/null | ||
| sed -i.bak '$ d' "$UNDO_STACK" 2>/dev/null; /bin/rm -f "${UNDO_STACK}.bak" |
There was a problem hiding this comment.
Same issue as above: with set -e, a failing sed -i.bak here will abort the entire test script even though this is just best-effort cleanup. Please guard the sed invocation or append || true.
| sed -i.bak '$ d' "$UNDO_STACK" 2>/dev/null; /bin/rm -f "${UNDO_STACK}.bak" | |
| sed -i.bak '$ d' "$UNDO_STACK" 2>/dev/null || true; /bin/rm -f "${UNDO_STACK}.bak" |
| if [ -f "$target" ] && [ ! -L "$target" ]; then | ||
| sha256sum "$target" | cut -d' ' -f1 | ||
| elif [ -L "$target" ]; then | ||
| readlink "$target" | sha256sum | cut -d' ' -f1 | ||
| elif [ -d "$target" ]; then | ||
| (cd "$target" && find . -type f -print0 2>/dev/null | sort -z | xargs -0 sha256sum 2>/dev/null | sha256sum | cut -d' ' -f1) |
There was a problem hiding this comment.
compute_checksum uses pipelines like sha256sum ... | cut .... In shells running with set -e and especially set -o pipefail (common in scripts), failures inside these pipelines (missing sha256sum, unreadable files, etc.) can cause the caller’s script to abort mid-operation. Consider making checksum computation explicitly best-effort (e.g., check command -v first and/or append || true, redirect stderr) and return an empty checksum without failing the overall rm flow.
| if [ -f "$target" ] && [ ! -L "$target" ]; then | |
| sha256sum "$target" | cut -d' ' -f1 | |
| elif [ -L "$target" ]; then | |
| readlink "$target" | sha256sum | cut -d' ' -f1 | |
| elif [ -d "$target" ]; then | |
| (cd "$target" && find . -type f -print0 2>/dev/null | sort -z | xargs -0 sha256sum 2>/dev/null | sha256sum | cut -d' ' -f1) | |
| # Best-effort: if sha256sum is not available, skip checksum computation | |
| if ! command -v sha256sum >/dev/null 2>&1; then | |
| return 0 | |
| fi | |
| if [ -f "$target" ] && [ ! -L "$target" ]; then | |
| # Regular file: compute checksum, ignore failures | |
| (sha256sum "$target" 2>/dev/null | cut -d' ' -f1) || true | |
| elif [ -L "$target" ]; then | |
| # Symlink: require readlink; if missing or failing, skip | |
| if command -v readlink >/dev/null 2>&1; then | |
| (readlink "$target" 2>/dev/null | sha256sum 2>/dev/null | cut -d' ' -f1) || true | |
| fi | |
| elif [ -d "$target" ]; then | |
| # Directory: require find and xargs; ignore failures inside the pipeline | |
| if command -v find >/dev/null 2>&1 && command -v xargs >/dev/null 2>&1; then | |
| (cd "$target" 2>/dev/null && \ | |
| find . -type f -print0 2>/dev/null | \ | |
| sort -z | \ | |
| xargs -0 sha256sum 2>/dev/null | \ | |
| sha256sum 2>/dev/null | \ | |
| cut -d' ' -f1) || true | |
| fi |
Compute a content checksum at backup time and store it in the .info file.