Archive for October, 2010

Looking through this week’s set of readings on the uses and abuses of metadata and tagging, I was reminded of my longstanding discomfort with the nature of some academic critiques of new technologies. “Google’s Book Search: A Disaster for Scholars” by Geoffrey Nunberg is an illustrative example. A well-respected linguist who publishes and speaks frequently in the mass media, Nunberg decries the primitive state of metadata across the books that make up Google’s massive collection as a “train wreck” for scholarship.

To be sure, Nunberg performs a valuable service in communicating to a broad audience some real limitations of the technology. He does a great job presenting his findings about systematic errors in the metadata provided by Google. He shows the horrendous and hilarious results returned, presumably, by a Google search on the publication year 1899. He also points out a large number of misdatings of popular and important books. In fact, according to Nunberg, his unscientific sample produced an error rate of 70% on publication dates. He goes on to reveal an array of undeniably silly classification errors in classic texts: ‘Moby Dick’ labeled Computers! ‘Jane Eyre’ labeled Architecture! For digital historians and archivists, it’s a powerful demonstration of the importance of good metadata.

Nunberg also does a good job pinpointing a major source of these problems: the machine-based scanning and data harvesting system used by Google. The problem is that this is the one and only explanation he offers with any substance – perhaps not entirely unrelated to its coming directly from the mouths of Google employees – and his prescriptive vision suffers as a result. In the place of serious analysis, he resorts to speculations about Google’s lack of cultural sophistication, tinged with more than a little elitism. “I have the sense that a lot of the initial problems are due to Google’s slightly clueless fumbling as it tried to master a domain that turned out to be a lot more complex than the company first realized,” he writes, perhaps imagining that nobody at Google has been exposed to literature. He triumphantly accuses Google of making the wrong choice when it adopted BISAC, a classification system often used by commercial retailers, as its standard. After briefly ruminating about its potential usefulness for ad placements, he sniffs with indignation that this system distorts the relative weight of important books because “Bambi and Bullwinkle get a full shelf to themselves, while Leopardi, Schiller, and Verlaine have to scrunch together in the single subheading reserved for Poetry/Continental European.” It’s not difficult to imagine Nunberg looking down his nose as he says, “Google has taken a group of the world’s great research collections and returned them in the form of a suburban-mall bookstore.”

In the end, we are left with a couple of limp possibilities. Nunberg hopes that organizations like the Internet Archive or a consortium of libraries called HathiTrust may, in his words, “pick up the slack.” Most importantly, says Nunberg, Google should be motivated to license metadata from the Library of Congress and OCLC out of a sense of obligation to demonstrate its claim that Google Books is a “public good” or to avoid becoming “a running scholarly joke.” Even if we overlook the ethnocentrism of the former idea and the naivete of the latter, Nunberg seems to have missed the memo about Google: it’s a company, a very big and very profitable company. It provides products to the public for “free,” wrapped in the cozy rhetorics of freedom, access, knowledge and fun, all the while making money hand-over-fist by tracking and scanning our every online move so that it can charge advertisers premium prices for ‘targeted ads.’ This is why it’s ironic when Nunberg says condescendingly, “It’s clear that Google designed the system without giving much thought to the need for reliable metadata.” He is very, very correct, but for reasons very different from the ones he imagined. Had he made an effort to look past Google’s promise to build the world a free library, he would have noticed Google magically turning public and private property into another wildly successful profit-generating engine, one that turns a profit day and night, regardless of whether naïve or elitist academics think Google Books is good enough for scholarly research.


Read Full Post »

A former classmate of mine, in her research on smallpox during the Revolutionary War, introduced me to Harvard University’s online collections, and I revisited them this year to see if they could be useful to my own research. Harvard’s library offers a variety of online material, mostly in the form of digitized versions of its print collections. In addition, it offers subject-specific and thematically arranged ‘collections’ as part of its Open Collections Program. Material is pulled from various collections in the Harvard Library and arranged to create a new subject collection.

The program began in 2002 as part of a larger effort to get material online and in a form that would reach the largest number of people. The head of Harvard’s library, Robert Darnton, says the mission of the project is for “Harvard to share its intellectual wealth with the rest of the world.” A bold mission, but Harvard delivers a pretty user-friendly product.

The Open Collections Program (OCP) offers six collections, built around areas of potential research interest:

  • Reading, Harvard’s views of readers, readership and reading history
  • Islamic heritage project
  • Expeditions and discoveries: sponsored exploration and scientific discovery in the modern age
  • Contagion: Historical views of disease and epidemics
  • Immigration to the U.S. 1789-1930
  • Women Working, 1800-1930

I found that both the Women Working collection and the Immigration collection have some documents related to New York City and to Greenwich Village.

Each collection has its own site where you can browse and search the material. You can search in several ways: by creating organization, form or recognizable names. Each collection provides a scope and content note as well as copyright and permissions information. As part of the goal of reaching a wide audience, the collections also come with teacher resources so they may be used in the classroom.

Each item comes up in an image viewer, in which you can zoom, and convert the item to a PDF for printing. (10 page limit!) You can also flip through the table of contents provided in the sidebar, or turn pages back and forth.

Searching through the collections can be a little overwhelming if you are not going in with a specific search term in mind; however, the collections do offer some interesting documents. Even more than that, they offer an overview of Harvard’s holdings on each of these topics, making this a good starting point for primary source research.

If you want to check out any of the collections, or get more detail about the project and funding, visit Harvard University’s Open Collections Program website.

Read Full Post »

Tagging Anarchy

Thinking of “tagging” as a way to classify and group all information on the web is quite overwhelming. As much as I get the impression from Tagging that the people behind the various schemes and systems know what they are doing, the whole universe of tagging still seems to me too anarchic and unorganized. Why does this matter? Perhaps it doesn’t; it just makes my brain hurt.

The Internet is largely a vast ungovernable landscape. While tagging is perhaps the best way we have come up with to organize news, opinion, products and just about everything else, I’m reluctant to accept it as convention until there are fewer cooks in the kitchen.

The point of many tagging schemes, according to the Gene Smith book, is that having so many users creating, organizing and grouping tags means that lesser-used or “incorrect” tagging conventions are likely to drop off, leaving some sort of consensus behind. Like conducting a large poll or study, the outliers will be so few that they won’t affect the big picture.

However, I think that, as with many forms of communication, there ought to be conventions when it comes to tagging. Some web sites do have staff who regulate and review tags, but on others, that task is up to the user. Internal regulation should be done across the board. Smith’s book gives good examples. There shouldn’t be 34 tags that mean “science fiction,” nor should “Marshall Mathers,” “Eminem” and “Slim Shady” point users to different items.
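To make the idea concrete, here is a rough sketch of what such internal regulation might look like in Python. Everything in it is an assumption for illustration: the synonym table, the function names and the choice of canonical labels are hypothetical, not taken from Smith’s book or from any real tagging site.

```python
# Hypothetical synonym table mapping variant tags to one canonical tag.
# Both the variants and the canonical labels are illustrative assumptions.
CANONICAL_TAGS = {
    "marshall mathers": "eminem",
    "slim shady": "eminem",
    "sci-fi": "science fiction",
    "scifi": "science fiction",
    "sf": "science fiction",
}


def normalize_tag(raw_tag):
    """Lowercase the tag, collapse whitespace, then apply the synonym table."""
    cleaned = " ".join(raw_tag.lower().split())
    return CANONICAL_TAGS.get(cleaned, cleaned)


def group_items_by_tag(tagged_items):
    """Group item names under canonical tags.

    tagged_items is a list of (item_name, list_of_raw_tags) pairs.
    """
    groups = {}
    for item, raw_tags in tagged_items:
        # A set keeps an item from being listed twice under one tag
        # when two of its raw tags normalize to the same canonical form.
        for tag in {normalize_tag(t) for t in raw_tags}:
            groups.setdefault(tag, []).append(item)
    return groups
```

With a table like this, an item tagged “Slim Shady” and another tagged “Marshall Mathers” would both end up under the same heading, which is the kind of consistency I would like someone to be in charge of.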

More work is being done all the time behind the scenes to create new ways of cataloguing our online information, but I hope these developments will lead to putting someone in charge.

Read Full Post »

Archives Unbound

A fantastic tool for any historian or archivist is Archives Unbound, an online archive of digitized primary sources. The website stands out for what it calls an “intuitive search platform,” which allows the user to enter words and phrases and receive organized, relevant results. Its collection of works is extensive, and includes everything from Supreme Court records to FBI documents. Archives Unbound also provides a space for users to suggest archives that they would like to see digitized, stating, “It isn’t every day a researcher has the opportunity to help create the digital collection of their dreams. But with Archives Unbound you can help create new resources from the ground up.” It isn’t often that one encounters an archive so eager to receive input from the public. The website was originally designed to assist librarians, but its success as a search engine makes it a vital tool for anyone doing research.

Read Full Post »

Jane Cunningham Croly

In April 1868, a banquet honoring Charles Dickens was thrown at Delmonico’s restaurant (on the corner of 14th and Fifth) by The New York Press Club. Jane Cunningham Croly, who wrote under the name Jennie June, was not present at this event.

Stories vary as to why. Some say she was a member of the New York Press Club and they refused to add her to the guest list because she was a woman. Others say that the Press Club would not admit her as a member because she was a woman.

Whatever the truth was, Croly had had enough of it. She invited other career-minded women to her home on Fourteenth Street so they could discuss ways of fighting discrimination.

Sorosis, the nation’s first professional women’s organization, was founded on March 21, 1868. Counted among its members were poet Alice Cary, who was the club’s first president, and writer Fanny Fern.

Sorosis was likely named after the botanical term for a fruit formed from the ovaries or receptacles of many flowers meshed together on a crowded stem (such as a pineapple), or from the Latin word soror, which means sister.

The club was a precursor to both the New York Women’s Press Club (1889) and the General Federation of Women’s Clubs (1890), and one of its early accomplishments was organizing a similar banquet that New York men and women attended as social equals. From what we have read and posted so far, it is not surprising that this bold, forward-thinking club was the brainchild of a Greenwich Villager.

Read Full Post »


I was finally able to locate Allen Tannenbaum’s New York in the 70s in Bobst (it turns out oversized F is in the northeast corner of the 7th floor), and it was worth the wait. It’s oversized, it’s fun and it’s full of big photos by the former Soho Weekly News photographer. As a friend at work said, “It’s just a big excuse to publish a whole bunch of pictures of celebrities.” And that is true to an extent. There might be a few too many pictures of Mick and Bianca Jagger, but there is also some great photojournalism: a demonstrator in a crowd holding up the iconic Daily News headline, “Ford to City: Drop Dead”; a grade school kid from the Lower East Side jumping from a makeshift platform onto a pile of mattresses surrounded by the broken windows of abandoned tenements; the first gay pride parade heading uptown on Sixth Avenue. In addition, there are some excellent short essays that touch on some of the more salient political and cultural events of the decade. See some galleries here:

http://www.sohoblues.com/soho_blues.html (The galleries are in the Soho Blues box)

And while I was looking for links to the galleries I happened to run across Yoko Ono’s website.


Tannenbaum photographed the famous shoot for Lennon’s last record. The site is not really that relevant, other than to show that you’re never too old to be digitally together. It’s so clean I thought it was an Apple product page for a while.

Read Full Post »

Helen Gee’s Limelight, as the subtitle reveals quite plainly, is a memoir about a Greenwich Village photography gallery and coffeehouse in the 1950s located on Seventh Avenue South at Barrow Street. It is, however, three stories rolled into one.

On one level it is the story of a single mother, whose daughter is the product of an interracial marriage (the father having succumbed to mental illness), who decides one day to open her own business – in 1954.

That year “Father Knows Best” was on TV and Elvis cut his first record. It was also the year of the hydrogen bomb test, the Brown v. Board of Education desegregation ruling (Rosa Parks’s arrest would follow in 1955), and the enactment of social security legislation. Senator McCarthy was riding high on an anti-communist wave and the phrase “Under God” was added to the Pledge of Allegiance. Ellis Island closed as an immigrant entry point to America.

In this context, Gee recounts the ups and downs of raising a child and being an entrepreneur at a time when respectable women simply didn’t do such things. Gee presents a personal, firsthand account of the challenges, such as living beneath a trio of bookies, run-ins with labor organizers, resisting getting a television set in the face of her daughter’s claim to be underprivileged, and hiding her second marriage from her Irish in-laws because of her previous divorce.

On a second, equally important level, her memoir is also the story of photography as an art form in America. The Limelight was the first, and at the time only, gallery dedicated to photography. During its seven years of existence it hosted 70 exhibitions, including works by Minor White, Alfred Stieglitz, Berenice Abbott, Ansel Adams, and Robert Frank. Photography was such a new artistic commodity that at the end of 1954, when the Limelight held a group exhibition, the photographers didn’t know how to price their work since there was no precedent. By and large, none of them had ever sold to collectors. In 1955 photography was finally given its due as an art form when Edward Steichen’s exhibition “The Family of Man” opened at the Museum of Modern Art.

Finally, her narrative chronicles to some degree the changing social and cultural landscape of Greenwich Village in the 1950s. She recounts how one of the bouncers refused entry to anyone wearing blue jeans or dungarees, since such a style choice was a sign of low character. Ironically, more than a few photographers had to use the back door to gain entry. Another change she notes is the surge of what were considered “bohoes” (Bohemian hoboes) who “sprawled around the fountain in Washington Square, straggled along MacDougal and the neighboring side streets, squatting on curbs and in doorways. They were…something of a tourist attraction” (p.71). But Gee makes clear that these rebellious youth lacked the charm of old-time Village Bohemians and were not entirely desirable within the artistic community. Hence the overzealous bouncer whose initial task was clearing them out of the Limelight.

Overall, Limelight is an easy, engaging read that not only tells a personal story but captures a unique time, place and the people who made it so.

Read Full Post »
