From the title of this post you have probably already figured out that I wasn’t successful in tracking when the PDFs on the Women’s March Unity Principles page changed. It’s always less fun to document when something doesn’t work the way you wanted, but I’m doing this in case it’s useful for anyone else.
These words of wisdom have helped me through this week:
Your feminism is either intersectional or it is garbage
— Lorelei Lee (@MissLoreleiLee) January 18, 2017
Why was I even trying to do this?
It was easy to set up Versionista to track changes to the Women’s March Unity Principles webpage. On this page there’s a link to a longer PDF document. I wanted to be able to save the various versions of the full PDF statement and then compare them to see what had changed. I know that this document has also changed because people have screenshots of various versions. Also, this document used to be 5 pages and now it’s 6.
This started as a place for me to put my anger around sex workers being thrown under the bus by the Women’s March. In watching the changes to the website I also saw how “disabled women” was added to the first paragraph of that page. To me, the changes in language (additions, deletions, changes) illustrate power struggles within this movement. I’m so curious about the politics behind each edit.
Library technology colleagues are awesome
I’m really lucky to work with library technology colleagues who are smart, curious and generous. A big thank you to Peter Binkley for his time tweaking a script he had written to email him updates to the bus schedule when the PDF schedule was changed. Peter made some changes to his script so that it would email both of us when the PDFs on the Women’s March site changed. Unfortunately that didn’t work, as the name of the PDF and the location of the file kept changing.
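Because the file name and location kept changing, one possible workaround is to start from the page rather than from a fixed PDF URL: collect whatever PDF links the page has at that moment, then fingerprint the content of each file. The following is a minimal sketch of that idea and not Peter's script. The class name, function names and the idea of storing fingerprints between runs are all assumptions made for illustration.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collects the href of every <a> tag that points at a .pdf file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.split("?")[0].lower().endswith(".pdf"):
            # Turn relative links into absolute URLs.
            self.links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return absolute URLs for the PDF links found in a page's HTML."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def fingerprint(data):
    """SHA-256 of a file's bytes. The same content gives the same value,
    whatever the file is called or wherever it is stored."""
    return hashlib.sha256(data).hexdigest()
```

Downloading each link (for example with `urllib.request.urlopen`) and comparing its fingerprint with the ones saved from the previous run would flag a new version even when it appears under a new name.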
Coming out as a former sex worker is the scariest thing I’ve done professionally. My big fear is that the people I work with (both at my workplace and in the Access and code4lib communities) would dismiss or shun me and the work that I do. These communities are really important to me, and it’s been amazing to have colleagues offer their technical smarts and support. When Christina Harlow suggested I could put the PDFs in GitHub and that she and others would help run comparisons and share the change outputs, I found myself crying on the bus.
Being clear that I am a former sex worker (and a feminist and a librarian) positions me in a unique place to be making these critiques of the Women’s March. Librarianship is not neutral, and neither are the changes to Women’s March Unity Principles. Being out is also necessary to be trusted by some sex work activists–I’m not a researcher who wishes to study sex workers, I have this lived experience. While I have experience doing feminist activism, I have very little experience doing sex worker activism. It’s felt good to put my librarian skills to use in service of sex worker rights and supporting sex worker activists.
How to see what has changed in 2 versions of a PDF
There were 3 excellent suggestions from colleagues:
- Sean Hannan suggested pdfdiff. I didn’t end up trying it. I’m not comfortable working in the command line, and although this didn’t seem daunting, the other tools worked better for me.
- Bethany Nowviskie suggested Juxta Commons, a tool she helped build.
- Carmen Mitchell suggested using the Compare Documents functionality built into Adobe Acrobat Pro.
For a free, web-based tool, Juxta Commons does a lot and is easy enough to use.
According to the 4-year-old video, Juxta Commons can only accept plain text or XML, but according to the documentation it now accepts more file types: HTML files, Microsoft Word DOCX, Open Office, EPUB and PDF. I didn’t realize this, so I did the unnecessary step of converting the PDFs to text files using Omnipage.
I liked the different comparison tools. The heatmap shows where changes have happened, and there are icons to identify things that have been added, deleted or changed. For me the side-by-side comparison was the most useful. The histogram was also useful for seeing all of the changes at more of a macro level. This is how I realized that I was comparing different copies of the same version of the PDF.
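For anyone who already has both versions as plain text files, a rough line-by-line comparison can also be sketched with `difflib` from Python's standard library. This is not how Juxta Commons works, and it is far less polished. The function name and the "older version" / "newer version" labels are made up for this example.

```python
import difflib


def text_changes(old_text, new_text):
    """Return unified-diff lines describing how old_text became new_text."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="older version",
            tofile="newer version",
            lineterm="",
        )
    )
```

Lines starting with `-` appear only in the older text and lines starting with `+` appear only in the newer one. An empty list means the two texts are identical line for line, which would have been an early hint that both files were the same version.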
Adobe Acrobat Pro – Compare Documents
I’m glad Carmen reminded me of this as I had forgotten it was there. This was pretty straightforward. You tell Adobe Acrobat which PDF is the newer one and which is the older one, tell it which pages you want to compare, and then pick from 3 different document layout types: 1) reports, spreadsheets, magazine layouts; 2) presentation decks, drawings, illustrations; 3) scanned documents.
Again, I was unknowingly comparing 2 copies of the same PDF and it found no changes.
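A quick way to catch this before spending time on a comparison is to check whether the two files are byte-for-byte identical. Here is a small sketch using `filecmp` from Python's standard library; the function name is hypothetical.

```python
import filecmp


def is_same_file_content(path_a, path_b):
    """True when the two files are byte-for-byte identical."""
    # shallow=False forces a comparison of the contents,
    # not just the file size and modification time.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

A result of `True` means the comparison tools will find nothing. A result of `False` only means the bytes differ, which can happen when a PDF is re-saved with the same text, so the wording still needs to be compared.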
Juxta Commons is way more useful, but most people already have Adobe Acrobat on their computer. If I had a bunch of documents to compare or was going to do this more than once I’d recommend using Juxta Commons.
Today Trump was inaugurated as the US President. Already his government is making radical changes to what information is on the White House website, including removing the LGBT rights page, and removing pages on civil rights, health care and climate change. As librarians we have some useful skills that we need to use to resist fascism and foster the social change we want to see.
Be careful with each other so we can be dangerous together.