Matt Miller

About

✺Metadata

  • WoodBlockShop
    2024-12-16
    Using Segment Anything, LLaVA and other methods on a 14K image corpus
  • Banned Metadata
    2024-09-27
    Analysis of PEN America 2022-2023 Banned and Challenged Book Metadata
  • Lomax Folk Song Collection & Whisper Transcription
    2024-09-12
    Automatic transcription of Alan Lomax’s Midwest Folk Song Collection using Whisper.cpp
  • Using GPT on Library Collections
    2023-03-30
    Use cases for applying GPT3/3.5/4 on a full text collection
  • Animated Gifs in US Elections
    2021-12-01
    Mining the Library of Congress web archives for political Gifs
  • Wiki2MARC
    2021-06-25
    Working on a tool to build name authority MARC records from Wikidata entities
  • Pomodoro - OCR Tool
    2020-12-17
    A tool that makes extracting text from complicated documents easier
  • Non-Renewed Copyright Analysis
    2020-07-22
    Analysis of non-renewed works from 1923–1964 that are likely in the public domain
  • Smithsonian Open Access
    2020-02-28
    Analysis of the Smithsonian open data release
  • Wikidata on id.loc.gov
    2019-05-22
    I connected Library of Congress authority records to Wikidata
  • Triple Builder
    2018-07-28
    Like GitHub’s gist but for random RDF graphs. A spreadsheet RDF triple editor
  • Internet Archive book list
    2018-03-09
    Worked at Internet Archive as a contractor building a big book list
  • LCC is made of people
    2017-12-04
    Individuals found in ranges of LCC.
  • Mapping LCC to Wikipedia via LCSH
    2017-10-29
    Connecting Wikipedia to LCC via LCSH :|
  • Library of Congress Holdings by LCC
    2017-10-11
    A force diagram of resources held at the Library of Congress by LCC classification
  • Analyzing 20 Million FCC Net Neutrality Comments
    2017-09-22
    Looking at the public comments on the FCC's attempt to end Title II protection
  • Diagramming Bibframe 2.0
    2017-04-21
    A large diagram of Bibframe
  • Zines vs. Google Vision API
    2017-04-17
    Using commercial APIs to extract metadata about zine content
  • CIDOC vs Solr — Steamrolling your model
    2017-04-10
    Looking at using a complex model like CIDOC in a search index
  • Available Bulk Records
    2017-03-17
    I made a list of available cultural heritage bulk lists.
  • BILLI / Nomadic Classification
    2016-01-26
    Exploring how legacy in-house classification systems at NYPL can be leveraged using linked data.
  • MARC Schema as JSON
    2014-03-11
    Using JSON to describe the MARC standard.
  • Embedded Metadata Explorer
    2011-06-01
    A web tool to embed data, like a Dublin Core record, into digital assets. It was a MILS class project and a DCMI poster paper.

✲Cultural Heritage Systems

  • id.loc.gov update
    2018-11-09
    I work on id.loc.gov at the Library of Congress. In 2018/19 I did a big update to the site design.
  • data.carnegiehall.org
    2018-01-01
    I worked as a contractor building Carnegie Hall's linked data infrastructure and data portal
  • Registry
    2016-06-01
    The Registry was a large metadata-to-linked-data aggregation project at NYPL. View the presentation or read the blog to learn more. The video was a demo of the front-end of the system.
  • “Florentine Renaissance Drawings” Digital Project
    2015-06-11
    A Linked Data system presenting three editions of Bernard Berenson’s catalog of Florentine painters.
  • NYPL Archives
    2014-01-01
    I spent two years building NYPL's archive portal.
  • OSU VRL System
    2010-06-01
    In 2009 I worked for Ohio State University building a system for their Visual Resource Library

✼Wikipedia / Wikidata / Wikibase

  • Migrating Your Docker Wikibase
    2024-03-20
    How to migrate a Docker Wikibase to a new server: a step-by-step tutorial
  • Variants: Comparing LCSH alt labels and Wikipedia redirects
    2019-09-25
    Both Wikipedia and LCSH have alternative labels; this compares them.
  • Installing Wikibase Tutorial
    2019-09-03
    Video tutorial on installing Wikibase
  • Analyzing DOI Citations in English Wikipedia
    2018-04-20
    Analysis of the journal articles cited across all of English Wikipedia
  • Analyzing Books Cited in English Wikipedia
    2018-04-08
    Analysis of the books cited across all of English Wikipedia
  • Wikibase for Research Infrastructure
    2018-03-09
    Using Wikibase as a platform for your research project
  • Mapping Wikidata to Bibframe
    2018-01-24
    A look at how a Bibframe record could be represented in Wikidata

✯Data visualization

  • Hathi Trust Public Domain Viz
    2023-01-01
    Browse new resources entering the public domain in Hathi Trust
  • Leeks: ParkMobile
    2021-06-09
    Using the ParkMobile data leak to explore what nicknames people give their cars
  • Party Platforms
    2020-08-27
    A look at new words found in the Democratic and Republican party platforms
  • LCC Tree Map
    2020-07-29
    A Library of Congress Classification Tree Map for LC, Harvard and Columbia.
  • Library of Congress Web Archive
    2020-03-23
    Viewing all 21K LC archived web pages at once
  • Vectorizing the DPLA
    2017-12-19
    Using DPLA metadata records as a source, I vectorized them to see whether records cluster by textual features regardless of the metadata content.
  • List of all books @ Library of Congress
    2017-05-22
    A large list of all the books at the Library of Congress
  • RFC — Visualizing Internet History
    2017-04-20
    Visualizing the entire corpus of the RFC (Request for Comments) archive
  • Knight Challenge Elections
    2015-03-24
    Analysis of the submissions to a Knight Foundation challenge
  • LinkedJazz Tulane
    2015-02-15
    Prototype network viz for Tulane CONTENTdm metadata to RDF
  • Knight Challenge Libraries
    2014-10-16
    Analysis of the submissions to a Knight Foundation challenge
  • The Networked Catalog
    2014-07-05
    Rendering the entire NYPL catalog as a subject network graph.
  • All of NYPL's finding aids printed out
    2014-02-03
    Large print visualisation of all of NYPL’s finding aids (10K). Going away present.
  • NYPL Network Renderings
    2013-10-13
    Timelapse network renderings of NYPL’s catalog.
  • 1001 NYPL Finding Aids
    2013-08-29
    Proof of concept to render 1001 NYPL finding aids as one document.
  • Moretti-izer: Geometric Literature
    2012-09-01
    An old experiment with Franco Moretti's idea
  • Linked Jazz Network
    2012-06-01
    A network visualization of the relationships of jazz musicians based on historical oral history transcripts converted to linked data.

✰Creative Coding

  • LCNAF Anagrams
    2024-04-01
    If you have 11+ million names, like in the LC Name Authority File, how many of them anagram to each other?
  • Ley Lines
    2022-01-17
    Drawing lines between things, modern ley lines.
  • Wordle Shares
    2022-01-09
    Collecting Wordle Shares on Twitter
  • Jan 6th Overflow
    2022-01-06
    Livestream comment simulation
  • Endless Hallway of 2007 GIFs
    2021-12-01
    Wander down a hallway of 12,000 animated gifs from 2007
  • Every Arbys Twitter Bot
    2021-06-15
    Order some curly fries for me
  • Between The Places
    2020-12-22
    Inspired by Sol LeWitt, make a map cutting out the places you used to live
  • ISBN Uh-Oh
    2020-12-06
    A twitter bot posting ISBN collisions
  • Sun
    2020-07-04
    What did the sun look like on your birthday?
  • Maya Lin's Eclipsed Time
    2020-04-12
    A simulation of Eclipsed Time in the age of Covid-19
  • 6 Hour PowerPoint
    2020-02-18
    I put 1000 government PowerPoint presentations together
  • Rodin's Gates of Hell
    2019-09-26
    There are two Gates of Hell sculptures on the east and west coast of America
  • Anaphora - political speech
    2019-05-20
    Exploring repeated phrases in government audio from Library of Congress web archives.
  • Zork
    2019-04-20
    Extracted elements from the text adventure game Zork
  • Byzantine
    2019-03-20
    A little toy to make Frankenstein PDFs from government sources.
  • Warhol Repetition
    2018-10-27
    Extracting the frames in between serial photographs using computer vision stuff.
  • Portals to change - DPLA doors
    2018-06-02
    Portals to change
  • I trained a neural net on...
    2018-06-02
    Dumb neural net thing, I was sick of seeing people posting the things they are training
  • Effect Bath - BBC Sound Twitterbot
    2018-05-23
    A mashup bot using the BBC sound effect library
  • Little LC List Twitterbot
    2018-03-18
    Little lists of similar sounding books at Library of Congress
  • Generative Tropical Cyclones Drawings
    2017-09-09
    Randomly drawing the path of historical storms
  • Public Domain Cut-up Twitter Bot
    2017-02-10
    Twitter bot mashing up public domain images
  • Paint by MARC (numbers)
    2016-04-20
    Paint a picture by MARC record numbers
  • Code4Lib Markov Chain
    2016-04-01
    A Markov chain toy using presentations delivered at Code4Lib 2016
  • Authority Birthdays
    2016-03-26
    What authorities were created on your birthday
  • Emoji Field
    2015-06-01
    When you get heatstroke walking home from Trader Joe's and think about Windows screen savers but with emojis.

❂Digital Humanities

  • Repotting old DH projects
    2020-01-31
    Blog post about sunsetting old projects
  • DADAlytics: Sélavy
    2019-10-01
    A tool that organizes the output of the NER service and helps convert documents into RDF triples
  • DADAlytics: NER Service
    2018-01-01
    A tool that combines multiple NER tools into one service
  • DADAlytics
    2017-06-01
    An IMLS grant to build tools to aid in entity extraction
  • PADB Fluxus history Site
    2013-04-20
    A site to organize co-occurrence of performance artists. Specifically for the Fluxus movement. Support site for my MA in History of Art and Design from Pratt Institute
  • Linked Jazz 52nd St
    2013-01-01
    Crowdsourcing tool to map the relationships between jazz musicians

※Teaching / Research

  • Programming for Cultural Heritage
    2020-06-01
    I taught data + Python to library students at Pratt Institute for 6 years as an adjunct. Check out the class projects.
  • Semlab
    2018-01-01
    I am co-director of the Semantic Lab at Pratt Institute
  • Harvard LIL Fellowship 
    2017-06-09
    I was a 2017 summer fellow at Harvard’s Library Innovation Lab. I worked on Case Law and linked data
  • LOD Summer School @ University of Bologna
    2016-05-15
    For a few years I co-taught a Linked Data summer school at the University of Bologna in Ravenna, Italy.
  • Linked Jazz
    2015-01-01
    A linked data digital humanities project to map the relationships of jazz musicians via oral history transcripts.
  • MA History of Art and Design Thesis
    2013-04-20
    My master’s thesis for my MA in History of Art and Design from Pratt Institute. Using network analysis I looked at the Fluxus community and its development 1960-1969.
  • SILS Practicum
    2011-06-01
    My practicum for my Master's in Information and Library Science (MILS) degree

〠Drawing

  • Repeated Hunt
    2020-12-09
    Engravings repeated
  • Ostraca
    2020-10-29
    Counting votes in Philadelphia, a video
  • Daily Quar drawings
    2020-03-01
    Drawings under quarantine
  • Italian Street Signs Vol. 1
    2018-07-01
    Street signs from Italy
  • 2018 Notebook
    2018-06-01
    Random drawings from 2018
  • Aoristic
    2012-01-01
    Screen printing projects involving repetition of art and history.

❅Detritus

  • Clickbait machine self-destruction
    2021-02-21
    Can a news page make so many clickbait ads that even it cannot lift them?
  • Consumer Divination
    2020-05-15
    Buy a computer, change your life
  • List of Wikipedia Emoji redirects
    2019-10-05
    Emoji redirects
  • Garfield
    2019-07-29
    It was national lasagna day or something.
  • Cleveland socialist newspaper covers
    2018-10-29
    Cleveland had a socialist newspaper in the 1920s. I compiled all the covers; I think I was trying to impress a Polish person.
  • Patented Screenshots
    2018-06-08
    Some old ass websites
  • Spreadsheet monster
    2018-01-01
    something...about...idk, something to do with @ablwr
  • Frederick Wiseman Documentaries
    2017-09-27
    What types of documentaries did Wiseman make by percentage?
  • Mapping Modern Day Herms
    2017-09-08
    Experiment using mapping tools on my phone
  • Memory Fragments
    2017-07-11
    I went to Venice yallll
  • Matt Miller Club
    2017-06-01
    Hit the club and listen to bone thugs n harmony
  • List of Terror and Relief in Jorge Luis Borges
    2017-05-25
    Close reading is scary yall
  • Moving on from NYPL
    2017-02-17
    Me being polite about a bad situation