Today I needed to move 435 TIFF images from the San Marcos Daily Record Negative Collection into folders based on a section of their filename.
Each filename, such as SMDR_1959-185_001.tif, is made up of the following sections:
- SMDR, the collection code for the items digitized
- 1959, the year the negatives are from
- 185, the physical folder the negatives are from
- 001, the index number for the image
- tif, the file’s extension
If there is more than 1 negative in folder 185, then the 2nd image will get index number 002, i.e. SMDR_1959-185_002.tif.
[Image: some of the TIFF images that needed to be sorted into folders]
We organize the images in digital folders to match the physical folders, so every image in the SMDR collection from 1959 in physical folder 185 needs to be in a digital folder 185. The folder therefore only needs sections 1-3 of the filename, i.e. SMDR_1959-185.
I have the images and I know what my digital folder names need to be, but doing all of this work by hand would take a significant amount of time and I would probably make a few mistakes. This type of repetitive work is better handled by the computer: it will create the folders and move the files more quickly, and it is much less likely to make mistakes (assuming I provide the proper instructions).
This was very quickly done in Python 3.6 on Windows 10 using fewer than 20 lines of code:
import os
import shutil

dir = "1959_6x6/"
for file in os.listdir(dir):
    # get all but the last 8 characters to remove
    # the index number and extension ("_001.tif")
    dir_name = file[:-8]
    dir_path = dir + dir_name
    # create the directory if it does not exist yet
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    file_path = dir + file
    # move files into created directory
    shutil.move(file_path, dir_path)
This code doesn’t use best practices as I should be creating the directory path differently, but hey, it worked and I was able to create the directories, move the files, and write this blog post in significantly less time than it would have taken to do it manually.
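As a rough sketch of what creating the directory path differently could look like, here is a version that builds paths with pathlib instead of string concatenation. It is not the code I ran: the function name sort_into_folders and the suffix_length parameter are made up for illustration, and it assumes every image ends in an 8-character suffix like "_001.tif".

```python
from pathlib import Path
import shutil


def sort_into_folders(source_dir, suffix_length=8):
    """Move each .tif in source_dir into a subfolder named after the
    filename minus its last `suffix_length` characters (e.g. "_001.tif")."""
    source = Path(source_dir)
    moved = []
    # sorted() takes a snapshot of the listing, so folders created
    # inside the loop are never visited.
    for path in sorted(source.iterdir()):
        # Skip subfolders and anything that is not a TIFF.
        if not path.is_file() or path.suffix.lower() != ".tif":
            continue
        folder = source / path.name[:-suffix_length]
        # exist_ok=True replaces a separate os.path.exists() check.
        folder.mkdir(exist_ok=True)
        target = folder / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Calling sort_into_folders("1959_6x6") would do the same job as the loop above. Because it skips anything that is not a .tif file, running it a second time on the same folder should move nothing.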
I’ve been finishing up a project that requires a MARC record to be copied into 2 different locations and renamed based on directory names.
This script was written and tested on Windows 7 in Cygwin using the BASH Shell.
In my last post, I rotated 12 images with non-contiguous filenames like 91, 115, 185. This time, though, I have a set of images with contiguous filenames so I don’t need to type each one in by hand.
I rotated the images with ImageMagick version 6 in Cygwin’s BASH Shell using 1 line of code.
I’m on one of our older Windows 7 computers and need to rotate 12 images 90 degrees clockwise, but I know Adobe Photoshop won’t play well with my 1-bit bitonal scans.
I rotated the images with ImageMagick version 6 in Cygwin’s BASH Shell using 2 short lines of code.
We recently investigated how long exposures during digitization might be affected by different physical locations. We tested 4 locations and 2 shutter speeds to capture the blur induced by walking heavily around the copy stand during exposure.
The 4 areas tested were:
- Concrete Foundation at the ARC
- 7th floor of Alkek Library in the Corner of the Building
- 7th floor of Alkek Library in the Center of the Building
- 2nd floor of Alkek Library on a Raised Floor installed over carpet for data & power lines
We tested 2 shutter speeds that are representative of those we use to digitize film negatives using Artograph LightPad Pros:
- 1/10th of a second
- 1 second
I’m currently sitting in front of one of our Windows 10 computers that does not have Cygwin or Git BASH installed, so there’s no way for me to quickly rename TIFF files from the command line like I am used to... enter Windows PowerShell!
I need to rename the files according to the formula <filename_stub>_<###>.tif such that a file like SMDR_1950s-SF-37_May-17-2017_12-51-19.tif becomes SMDR_1950s-37_001.tif. In this case, the <filename_stub> is SMDR_1950s-37 and <###> is 001.
So I wrote a little Windows PowerShell script to help out.
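The PowerShell script itself isn’t reproduced here. For comparison, the same <filename_stub>_<###>.tif formula can be sketched in Python, the language used elsewhere on this blog. This is an illustration rather than a translation of that script: the function name rename_with_index is hypothetical, the stub is passed in by hand rather than derived from the old filename, and it assumes that sorting the existing filenames puts the images in the order they should be numbered.

```python
from pathlib import Path


def rename_with_index(folder, filename_stub, start=1):
    """Rename every .tif in `folder` to <filename_stub>_<###>.tif,
    numbering the files in sorted-filename order."""
    tiffs = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".tif"
    )
    renamed = []
    for index, path in enumerate(tiffs, start=start):
        # {:03d} zero-pads the index to three digits, so 1 becomes 001.
        target = path.with_name("{}_{:03d}.tif".format(filename_stub, index))
        # Stop rather than silently overwrite an existing file.
        if target.exists():
            raise FileExistsError("refusing to overwrite {}".format(target))
        path.rename(target)
        renamed.append(target)
    return renamed
```

With the example above, rename_with_index(folder, "SMDR_1950s-37") would turn the first file in the folder into SMDR_1950s-37_001.tif, the second into SMDR_1950s-37_002.tif, and so on.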
This post builds on my last one: Searching the Library of Congress with Python and their new JSON API, which is why I’ve added Part 2 to the end of the title. Before we dive back into the Library of Congress’s JSON API, some housekeeping items:
- Even though the Library of Congress’s website is loc.gov, the abbreviation for Library of Congress is LC
- I tried to find a press-ready image of NBC’s old The More You Know logo I could add here, but
  - the updated logo doesn’t make me hear the jingle in my head
  - I did find Megan Garber’s 2014 article covering the PSA series for The Atlantic that has some classic video I enjoyed
- As of October 2017, LC has expressly stated in a disclaimer that their JSON API is a work in progress. Use at your own risk!! We might (will likely) change this!
Recap on Stryker’s Negatives Project
I recently came across Michael Bennett’s article Countering Stryker’s Punch: Algorithmically Filling the Black Hole in the latest edition of code4lib: <-- GREAT STUFF!
He’s using Adobe Photoshop and GIMP to digitally restore the blank areas left in images where a hole punch was taken to the physical negative.
Use the Library of Congress’s JSON API to download all of the hole punch images and their associated metadata.
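As a minimal starting point for that goal, the sketch below asks loc.gov for JSON by adding fo=json to a search URL. Several things here are assumptions rather than facts from this post: the use of the general https://www.loc.gov/search/ endpoint, the expectation that matches come back under a top-level "results" key, and the function names. Since LC says the API may change, treat every key name as something to verify against a live response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the general loc.gov search endpoint; another endpoint may fit better.
SEARCH_URL = "https://www.loc.gov/search/"


def build_search_url(query):
    """Return a search URL that requests JSON instead of an HTML page."""
    return SEARCH_URL + "?" + urlencode({"q": query, "fo": "json"})


def fetch_results(query):
    """Fetch one page of search results as a list of dictionaries."""
    with urlopen(build_search_url(query)) as response:
        data = json.loads(response.read().decode("utf-8"))
    # Assumption: matching items are listed under a top-level "results" key.
    return data.get("results", [])
```

The search phrase that actually finds the hole punch images is not settled here, so any query passed to fetch_results is a placeholder. Downloading the image files and saving each item’s metadata would build on the dictionaries this returns.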