Experience | Ryan Adolfs

Historical newspaper pages and metadata processing pipeline

University of Alabama Special Collections Library

Machine Learning Dataset Encoder

Mar 2025 - Dec 2025 | Python | FAISS | XML ETL

At the University of Alabama Special Collections Library, I worked on designing and developing a machine learning and data engineering pipeline to improve how digitized historical newspapers are processed, described, and searched. Much of my work involved turning noisy OCR-derived newspaper text into structured preservation metadata for the library’s internal archival systems. I built and extended Python workflows that generated semantic embeddings from newspaper transcriptions, used FAISS nearest-neighbor retrieval to surface candidate Library of Congress Subject Headings, and used LLM-assisted metadata generation to produce richer subject descriptions and issue-level abstracts. I also developed an ALTO-to-DOB XML transformation pipeline that converted non-native XML formats into the library’s in-house structure, using both positional and semantic logic to reconstruct coherent text segments from existing files. To support that transformation, I engineered an XML-to-dataset ETL process and trained a 6-class machine learning classifier on roughly 33,000 XML-derived embeddings to label document segments automatically, which made large volumes of previously incompatible archival material usable within the broader pipeline. In practice, this work improved the searchability and usability of preserved newspaper collections, let the department use faster and lower-cost XML generation sources, and helped turn a labor-intensive archival workflow into a more scalable, automated, production-oriented system.
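To illustrate the retrieval step described above, here is a minimal sketch of nearest-neighbor lookup over embeddings. It is not the library's actual code: it uses a brute-force cosine-similarity search in plain Python as a stand-in for a FAISS index, and the function names, the three-dimensional vectors, and the heading labels are all made up for the example. Real embeddings would come from an embedding model and real candidates from the Library of Congress Subject Headings.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def top_k_headings(query_vec, heading_vecs, k=3):
    """Return the k (heading, score) pairs most similar to the query.

    Brute-force search: every heading is scored against the query.
    A FAISS index would perform the same kind of lookup at scale.
    """
    q = normalize(query_vec)
    scored = []
    for heading, vec in heading_vecs.items():
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((heading, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings for three illustrative subject headings.
HEADINGS = {
    "Cotton trade": [0.9, 0.1, 0.0],
    "Elections": [0.0, 1.0, 0.1],
    "Railroads": [0.1, 0.0, 0.9],
}

# Hypothetical embedding of one newspaper transcription.
article_vec = [0.8, 0.2, 0.1]

candidates = top_k_headings(article_vec, HEADINGS, k=2)
for heading, score in candidates:
    print(f"{heading}: {score:.3f}")
```

In this sketch the top-ranked headings are only candidates; a later step, such as the LLM-assisted metadata generation mentioned above, would decide which ones to keep.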

Blackjack++

Software Engineer

Jan 2024 - May 2024 | PHP | MySQL | JavaScript | Agile

Blackjack++ was a full-stack casino web application I built with a team of four other developers in an Agile environment. The project featured animated implementations of blackjack, slots, and roulette, and gave me the opportunity to work across both frontend and backend development. I developed backend functionality in PHP, used MySQL for database interactions, and used AJAX for asynchronous client-server communication without full page reloads. On the frontend, I helped build a responsive, interactive interface in JavaScript, HTML, and CSS to make gameplay smoother. The project also strengthened my experience collaborating within a software team: we used weekly stand-ups, backlog planning, and iterative development cycles to coordinate work and refine features over time. Overall, Blackjack++ was an end-to-end software engineering experience that combined web development, database integration, and team-based development practices in a single project.

Work that shipped useful systems.
