One of the bigger challenges for librarians is protecting the privacy of patrons while stopping them from stealing the books. In many cases, there's not much interesting in our reading choices, but sometimes there is. A spy might look for hints or clues in the list of books taken out by researchers from the nearby army base. A blackmailer may try to subvert some of the local police or security personnel by looking at what they read. This may be why some librarians are so careful to protect the choices of their customers.
Is it possible to protect the reading choices of library patrons from hackers, insiders, and snoops while catching thieves? At first glance, this seems difficult because the library must keep track of the books on loan to defend itself against people who don't bring them back. Some libraries try to delete all records after a book is returned, but that doesn't stop the curious from looking at the list of books that are currently checked out.
The surprising result is that the library doesn't need to keep a list of what people are reading to stop theft. A few simple one-way functions can lock out even the most adept snoops. (A good one-way function is the Secure Hash Algorithm, or SHA; many toolkits now include implementations of it and of HMAC, a more general keyed construction built on top of a hash function.)
One-way functions scramble information so that it is unreadable, but they don't remove all of its usefulness. If you want to see what functions like SHA do to text, you can try scrambling some book data yourself.
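A minimal Python sketch of that experiment follows. It assumes SHA-1 from the standard `hashlib` module (its 40-character hex digests have the same shape as the values in the tables on this page) and a `title/author` string layout. The helper name `scramble` is made up for illustration.

```python
import hashlib


def scramble(text: str) -> str:
    """Return the SHA-1 digest of text as 40 hexadecimal characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


record = "All's Well That Ends Well/William Shakespeare"

# The digest reveals nothing readable about the title or the author.
print(scramble(record))

# Changing a single character yields an unrelated digest.
print(scramble(record + "."))

# The same input always yields the same digest, which is what lets the
# library match a returned book against its stored record.
print(scramble(record) == scramble(record))
```

The exact string format matters: the library has to build the input the same way at checkout and at return, or the digests will not match.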
The results should be inscrutable if the one-way function is working correctly. There should be no way to take the results of the function and recover the original text.
The trick is to pass the book title and the author through the one-way function, SHA(), before storing it away. That is, put SHA("All's Well That Ends Well/William Shakespeare") in the database instead of just the plaintext title "All's Well That Ends Well". Here's what a librarian's database might look like:
| Name | SHA(book title) | Due Date | Replacement Cost |
|---|---|---|---|
| Bob Jones | 19ded208e1d4f03f18f54bfead142edb3971632c | Jan 1 | $20 |
| | 201a0d9e68c174c0a8664b4d8510204ccad5583d | Jan 3 | $21 |
| | 873e637cc6d6eb2466ff2d0e02da8a54d1103cb7 | Jan 3 | $25 |
| Mary Jones | 51f33f597ecef6886e5426a304a2dec9b7b248f8 | Jan 2 | $15 |
| | 4f697e91504dd8754afa0c9d29e040fb94b9e52c | Jan 3 | $15 |
This table tells us that Bob Jones has some book out that is due on January 1. When he returns it, the library can examine the title, compute SHA(title/author), and delete the matching entry. Bob Jones is then relieved of his responsibilities. If he doesn't return it, he can be billed the $20 replacement cost.
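The checkout and return steps described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual software: the `Ledger` and `Loan` names, the field layout, and the choice of SHA-1 are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


def sha(text: str) -> str:
    """SHA-1 digest of text as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


@dataclass
class Loan:
    patron: str      # stored in the clear in this first version
    book_hash: str   # SHA(title/author), never the title itself
    due: str
    cost: int


class Ledger:
    """Hypothetical loan table that never stores a readable title."""

    def __init__(self):
        self.loans = []

    def check_out(self, patron, title, author, due, cost):
        self.loans.append(Loan(patron, sha(f"{title}/{author}"), due, cost))

    def check_in(self, title, author):
        """Recompute the digest from the returned book and drop the entry."""
        digest = sha(f"{title}/{author}")
        for loan in self.loans:
            if loan.book_hash == digest:
                self.loans.remove(loan)
                return True
        return False

    def owed(self, patron):
        """Total replacement cost of everything the patron still has out."""
        return sum(loan.cost for loan in self.loans if loan.patron == patron)


ledger = Ledger()
ledger.check_out("Bob Jones", "All's Well That Ends Well",
                 "William Shakespeare", "Jan 1", 20)
print(ledger.owed("Bob Jones"))
print(ledger.check_in("All's Well That Ends Well", "William Shakespeare"))
print(ledger.owed("Bob Jones"))
```

Billing needs only the patron, the due date, and the replacement cost, so the title never has to appear in the table.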
This solution does have a few weaknesses. SHA(title/author) may be easy to guess: someone can take the list of titles from Amazon or Books in Print, hash each one, and look for matches. There aren't that many books, so the search is quick. One solution is to give each book a unique, random ID number, something many libraries do already. If the number is long enough and chosen at random, then it is not practical for a spy or a blackmailer to guess it. Another solution is to add a password to the hashing equation by storing SHA(title/author/password) or SHA(title/author/patron's name). This significantly increases the complexity of any brute force attack, although it does not
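The guessing attack and the two countermeasures can be illustrated with a short Python sketch. Several things here are assumptions rather than part of the scheme as described: the tiny `catalog` stands in for a real list of titles, HMAC-SHA1 from the standard `hmac` module is used in place of literally appending a password to the string, and the names `keyed_sha`, `guess_title`, and `library_secret` are hypothetical.

```python
import hashlib
import hmac
import secrets


def sha(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def keyed_sha(secret: bytes, text: str) -> str:
    """Keyed digest: useless to an attacker who does not hold the secret."""
    return hmac.new(secret, text.encode("utf-8"), hashlib.sha1).hexdigest()


def guess_title(stored_digest, catalog):
    """Dictionary attack: hash every known title and look for a match."""
    for entry in catalog:
        if sha(entry) == stored_digest:
            return entry
    return None


catalog = [
    "Hamlet/William Shakespeare",
    "All's Well That Ends Well/William Shakespeare",
]
target = catalog[1]

# Plain SHA(title/author) falls to the catalog search.
print(guess_title(sha(target), catalog))

# With a secret mixed in, the same search finds nothing.
library_secret = secrets.token_bytes(32)  # assumed to be stored apart from the loan table
print(guess_title(keyed_sha(library_secret, target), catalog))

# A random 128-bit book ID has no catalog to search at all.
book_id = secrets.token_hex(16)
print(sha(book_id))
```

The keyed version is only as strong as the secret's hiding place: an insider who can read both the loan table and the secret can run the catalog search again.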
Removing the Reader's Name
The system can be made a bit more secure by also locking away the identities. Instead of storing the name in the column, the system can store SHA('name') or, better, SHA('random id string'). When a person returns their book, they can present their library card to delete the book from their record.
| SHA(Name) | SHA(book title) | Due Date | Replacement Cost |
|---|---|---|---|
| 06fda7eb8124bc3fa05ed70feb92c1c157d8fb23 | 19ded208e1d4f03f18f54bfead142edb3971632c | Jan 1 | $20 |
| | 201a0d9e68c174c0a8664b4d8510204ccad5583d | Jan 3 | $21 |
| | 873e637cc6d6eb2466ff2d0e02da8a54d1103cb7 | Jan 3 | $25 |
| 6980b62ba56b08d90e1f6103fdc42391856d12dc | 51f33f597ecef6886e5426a304a2dec9b7b248f8 | Jan 2 | $15 |
| | 4f697e91504dd8754afa0c9d29e040fb94b9e52c | Jan 3 | $15 |
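A table like this one might be maintained along the following lines. The sketch assumes that the library card carries a random ID string and that each book has a random ID as well. The function names and the tuple layout are hypothetical, and SHA-1 is again an assumption.

```python
import hashlib
import secrets


def sha(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Each row is (SHA(card id), SHA(book id), due date, replacement cost).
loans = []


def check_out(card_id, book_id, due, cost):
    loans.append((sha(card_id), sha(book_id), due, cost))


def check_in(card_id, book_id):
    """Presenting the card and the book recreates both digests."""
    key = (sha(card_id), sha(book_id))
    for row in loans:
        if row[:2] == key:
            loans.remove(row)
            return True
    return False


def amount_owed(card_id):
    """Only someone holding the card ID can look up its balance."""
    patron_hash = sha(card_id)
    return sum(cost for patron, _, _, cost in loans if patron == patron_hash)


card = secrets.token_hex(16)   # random ID printed on the library card
book = secrets.token_hex(16)   # random ID attached to the book

check_out(card, book, "Jan 1", 20)
print(amount_owed(card))
print(check_in(card, book))
print(amount_owed(card))
```

Someone reading the table sees only digests. The library still needs some separate way to map a card to a person when it is time to send a bill.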
Removing the person's name from the record would require taking some anonymous bond or deposit for the loan, a practice that might be too unwieldy. If people post a cash deposit with each loan, then it would be possible to remove the identity completely.
The library could also use some form of reputation database to see who was trustworthy and who wasn't. This isn't much different from what they do now.
If anyone has thoughts about the advantages and limitations of the approach taken here, please write. --- Peter Wayner, p3 (a) wayner (dot) org