In the last post I briefly discussed EAD and XML. ArchivesSpace produces an EAD XML document when you export a completed finding aid and select the XML format. There were a few surprises along the way.
When I exported the EAD for the Clarence G. Campbell Collection, I saw the EAD element tags selected. As Tom explained to me, you always have to run an XML document through an editor program to validate the coding. This ensures that other machines can read the document and that it can be transformed into a web page that browsers can display. For this task, Tom selected Oxygen Editor (more information on Oxygen Editor's validation component).
While running the Clarence Campbell finding aid through Oxygen, we found that certain parts of the document failed validation. The interesting part was that the values marked invalid were system-generated IDs. These IDs are alphanumeric character strings. When the string began with a number, it was marked invalid; when the string began with a letter, it passed validation. This matches the XML rule that an ID value must be a valid XML name, and a name cannot begin with a digit.
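To spot the problem IDs without opening an editor, something like the following could work. It is only a sketch: the regular expression is a simplified, ASCII-only stand-in for the full XML name rule, the function name `find_invalid_ids` is made up, and the sample document is invented (the element names are EAD tags, but the IDs are hypothetical, not real ArchivesSpace output).

```python
import re
import xml.etree.ElementTree as ET

# Simplified, ASCII-only approximation of the XML name rule:
# the first character is a letter or underscore; later characters
# may also be digits, ".", or "-".
VALID_ID = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def find_invalid_ids(xml_text):
    """Return every id attribute value that does not look like a valid XML name."""
    root = ET.fromstring(xml_text)
    invalid = []
    for element in root.iter():
        value = element.get("id")
        if value is not None and not VALID_ID.fullmatch(value):
            invalid.append(value)
    return invalid


# Invented sample: one ID starts with a letter, one starts with a digit.
SAMPLE = """<ead>
  <archdesc level="collection">
    <dsc>
      <c01 id="ref12ab" level="file"/>
      <c01 id="12abref" level="file"/>
    </dsc>
  </archdesc>
</ead>
"""

print(find_invalid_ids(SAMPLE))
```

Note that ElementTree only checks that the XML is well formed; it does not validate against the EAD schema, so a validating editor such as Oxygen is still needed for the real check.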
To check whether this was a bug in the ArchivesSpace system, we decided to contact the help desk and ask whether anyone else had noticed it or whether we had done something wrong. Either way, the simple solution for the moment was to alter the system-generated IDs.
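Altering the IDs by hand gets tedious for a long finding aid. One possible way to script it is sketched below, assuming that putting a fixed prefix in front of any ID that starts with a digit is acceptable. The function name `prefix_numeric_ids` and the prefix `id_` are my own choices, not anything ArchivesSpace or Oxygen provides, and the sample XML is invented.

```python
import xml.etree.ElementTree as ET


def prefix_numeric_ids(xml_text, prefix="id_"):
    """Prefix every id attribute that starts with a digit.

    Returns the rewritten XML as a string plus a dict mapping each
    old ID to its new value, so the changes can be reviewed.
    """
    root = ET.fromstring(xml_text)
    changed = {}
    for element in root.iter():
        value = element.get("id")
        if value and value[0].isdigit():
            new_value = prefix + value
            element.set("id", new_value)
            changed[value] = new_value
    return ET.tostring(root, encoding="unicode"), changed


# Invented sample with one digit-leading ID and one valid ID.
fixed_xml, changes = prefix_numeric_ids(
    '<ead><c01 id="12abref"/><c01 id="ref12ab"/></ead>'
)
print(changes)
print(fixed_xml)
```

This sketch does not touch other attributes that might point at the old IDs, so any internal links in the finding aid would need the same change. A real export may also declare an XML namespace, in which case ElementTree can rewrite the namespace prefixes on output; checking the result in a validating editor afterwards is still a good idea.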
Stay tuned for the results of the help desk email.
Friday, August 29, 2014
Wednesday, August 27, 2014
Day 16: Back to ArchivesSpace
After a short break, it was time to return to my work on ArchivesSpace. While I was working on the grant-funded research project, Tom, the ArchivesSpace administrator, had time to review some of the work that had been done in ArchivesSpace and figure out some new aspects and features.
For most of the day, Tom walked me through exporting the ArchivesSpace finding aid as an EAD finding aid in XML format and showed me how to read the markup. I noticed:
- It is a format very similar to HTML coding, only easier to read and more user-friendly.
- Certain tags cannot occur within others; each tag has a set of tags it can hold. For example:
  - the publicationstmt tag can hold
    - publisher
    - address
      - addressline
  - (PS: these tags are flanked by "<", ">", and "</". I cannot use these here as Blogger translates them as XML commands and they disappear.) For more information on this particular EAD tag, visit the LoC site on publicationstmt.
- What is even cooler about ArchivesSpace is that certain information is auto-populated into the XML EAD finding aid based on repository information entered when setting up the platform.
- Certain EAD tags do not work if entered in the wrong section and field.
- Because of this, you need to know EAD description best practices as well as how they translate to the ArchivesSpace platform.
The LoC also has a guide to best practices regarding tag use at EAD Best Practices
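Since the angle brackets cannot be typed directly into the post text, here is a small sketch that builds the publicationstmt nesting described above and prints it as XML. It uses Python's standard ElementTree module; the publisher name and address lines are placeholders, not the repository's real details, and an actual ArchivesSpace export will contain more than this fragment.

```python
import xml.etree.ElementTree as ET

# Build the parent tag, then add the tags it is allowed to hold.
publicationstmt = ET.Element("publicationstmt")
ET.SubElement(publicationstmt, "publisher").text = "Example Repository"

# addressline tags sit inside address, which sits inside publicationstmt.
address = ET.SubElement(publicationstmt, "address")
ET.SubElement(address, "addressline").text = "1 Example Road"
ET.SubElement(address, "addressline").text = "Example Town, NY"

fragment = ET.tostring(publicationstmt, encoding="unicode")
print(fragment)
```

The printed fragment shows each opening tag with its matching closing tag, which is the part Blogger swallows when typed into a post.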
Friday, August 22, 2014
Day 15: American Medical Association Citation style
So for some reason, web publishers just don't seem to know AMA citation style. They use a range of APA, MLA, and Chicago citation styles. I spent the better part of the day sifting and sorting citation styles and reformatting them using Excel worksheets.
By the end of the day, I was able to run a mail merge to get them all into AMA citation style. All in all, I completed bios and bibliographies for 19 individuals associated with The Human Genome Project. This project came to 40 typed pages, averaging 2 pages per bio.
Long day. Glad it is done.
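The mail merge step boils down to slotting each citation's fields into one fixed pattern. Here is a minimal sketch of that idea in Python, assuming a simplified AMA-style journal pattern (authors, title, abbreviated journal, then year;volume(issue):pages). The function name `format_ama` and the sample row are made up for illustration and are not real references; real AMA style has more rules (italics, author limits, DOIs) than this covers.

```python
def format_ama(authors, title, journal, year, volume, issue, pages):
    """Join citation fields into a simplified AMA-style journal citation."""
    author_part = ", ".join(authors)
    return f"{author_part}. {title}. {journal}. {year};{volume}({issue}):{pages}."


# Placeholder row, standing in for one line of the spreadsheet.
row = {
    "authors": ["Doe JA", "Roe RB"],
    "title": "A placeholder article title",
    "journal": "J Placeholder Stud",
    "year": 2000,
    "volume": 1,
    "issue": 2,
    "pages": "3-4",
}

citation = format_ama(**row)
print(citation)
```

Keeping each field in its own column, as in the worksheets, is what makes this kind of reformatting possible no matter which style the source used.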
Friday, August 15, 2014
Days 12-14: Survey of Legal Professionals Connected to the Human Genome Project
Part of creating a new digital resource is planning and research. For the future, CSHL's Archives is working on documenting the Human Genome Project. They have already completed initial bios and publication research on the scientists involved. Now they are up to the professionals who were on the project's fringe but who may be equally important to the public's understanding and perspective of the project.
On my list are: Journalists, Ethicists and Lawyers.
I started with the lawyers to see how long it would take to properly research each one and how much info would be freely available.
Yes... I do realize that lawyers are easier to identify and research than Ethicists; however, there were only four lawyers and that made a reasonable test group to start with and make a project plan from there.
Well, it took me about an hour per lawyer. To be honest, I did get wrapped up in some of the material and have decided to borrow a book entitled, "The Genome War: How Craig Venter Tried to Capture the Code of Life and Save the World." It details the controversy that wrapped itself around the Human Genome Project headed by James Watson. The controversy was about patenting the scientific discoveries made through the Project--the ethics, legality and morality of patenting something that occurs in nature.
For a great summary of the book, try: Discovery Medicine's Executive Summary
Thursday, August 14, 2014
Day 13: HGP - Researching Journalists
As part of the follow-up to an awarded grant, I began researching and compiling bios on individuals connected with the Human Genome Project. As the scientists' bios were completed the previous year, I am researching people who were on the fringe of the program: legal professionals, journalists and ethicists.
Some of these individuals are well-known for other accomplishments and some only appear once or twice in documentation.
After completing the test group of legal professionals, my rate per person was one per hour. At that rate of productivity, I should be able to complete the list in 25 hours (or 3.5 days). Let's see how this timeline shakes out.
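The estimate above is simple arithmetic, sketched here for anyone who wants to rerun it with their own numbers. The list size of 25 people is implied by the 25-hour figure, and the 7-hour working day is my assumption to get from hours to days; neither is stated outright in the project plan.

```python
people_on_list = 25      # implied by one person per hour and 25 hours total
hours_per_person = 1     # rate measured on the legal professionals test group
hours_per_day = 7        # assumed length of a working day

total_hours = people_on_list * hours_per_person
exact_days = total_hours / hours_per_day

# Round to the nearest half day, which is how the estimate is quoted.
estimated_days = round(exact_days * 2) / 2

print(f"{total_hours} hours, roughly {estimated_days} days")
```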
Resources used:
I start out by searching in Google Books: "first name last name" + "genome"
This search string pulls information from inside numerous out-of-print books and highlights the pages where the person's name appears. Adding the term "genome" narrows the search to those books where the person's name and the term genome appear on the same page.
Once inside the book, you can re-search using just the person's name and only pull up pages where that name appears.
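With a long list of names, the search string can be generated rather than retyped. Below is a small sketch that builds the same pattern, a quoted full name plus a quoted topic term, and URL-encodes it. The function name `build_query` and the name "Jane Doe" are placeholders; the encoded form is only what would go into a search URL's query parameter, and no particular Google Books URL is assumed here.

```python
from urllib.parse import quote_plus


def build_query(first_name, last_name, topic="genome"):
    """Build the quoted-name-plus-topic search string used above."""
    return f'"{first_name} {last_name}" + "{topic}"'


query = build_query("Jane", "Doe")
encoded = quote_plus(query)

print(query)
print(encoded)
```

The quotation marks matter: they keep the first and last name together as one phrase instead of letting the search match them separately.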
After this I generally search in professional directories such as Martindale Hubbell, LinkedIn, Who's Who, Men and Women of Science, CSHL archives and online catalog, and external special collections.
Later, I do a Google search with the person's name combined with a current/recent job title. This helps to pull up professionally relevant information on the person as well as some publications. To identify further genome-related publications, I use PubMed.
Methodology:
In an effort to keep the notes organized, I decided to create six sections: Professional (current position, career history, and connection to the HGP: what role he/she played, work accomplished, notes on important connections, and how this piece connected to the larger picture); Education; Vitals; Known To; Genome-Related Publications; and Connections to CSHL.
Day 11: Refoldering
So today I finished refoldering CGC's collection. The trick to refoldering is to be able to aid the researcher in defining the different documents and packets created by the collection's originator.
My first instinct was that each document received its own folder... WRONG!
In reality, only documents whose beginning and end are unclear require folders. So if you can tell where one manuscript starts and another ends, there is no need for a folder. (Remember that this is also the stage where we remove all paperclips, staples and fasteners, except for professional bindings.)
The great part of this stage is you get to double check your own work while refoldering. You get to make sure that the divisions make sense, that you did not miss one letter stuck to another, and you get to see the finished product and say, "Yes! I did that!"
I gotta say it felt good to see it all refoldered, defastened, and boxed up and labeled.
Good times. See ya next week!
Wednesday, August 13, 2014
Day 10: DISASTER
As is the case with all well-made plans, something always goes wrong. In this case, the culprit was a server and the catalyst was a rain storm. The result was a downed server that IT could not get back up and running.
The plan for today was to update and correct some Subject tags in the draft finding aid in ArchivesSpace for the CGC collection.
Instead I worked on a tutorial for creating Agent records in ArchivesSpace and finished refoldering the CGC collection I had begun. All little things that needed to be done anyway at some point in time.
Friday, August 8, 2014
Day 9: ArchivesSpace
Today was my first introduction to ArchivesSpace, a new tool for creating EAD Finding Aids.
As this is a new tool and a new user community, the software is free, but the community membership and help support are a paid subscription. Below is an introductory video outlining how to begin setting up ArchivesSpace for your repository.
Using the Clarence G. Campbell Collection I have been processing, we are experimenting with how to create a finding aid in the tool.
Thursday, August 7, 2014
Day 8: Creating a Finding Aid
"Proof of a good finding aid is in the finding."
A good Finding Aid is clear, concise and organized. All finding aids within a repository follow the same format, structure and terminology. All sections are clearly labeled and defined.
In general, the finding aids have an administrative section, an introduction, a scope and content note, overview of the folders/record series, and some kind of inventory. All of the work that went into the preparation for processing the collection comes together and is incorporated into the finding aid.
The processing plan can be reworked into the scope and content section, and the background research is incorporated into the abstract, introduction and historical note.
To view the traditional Finding Aid created, please follow this link:
For more information on CSHL's digital archival holdings, check out their Archives page and select Digital collections from the navigation box on the right of the screen. You can also access their blog and view their other archival initiatives from the link above.
To go directly to their digital holdings and finding aids: CSHL Archives Digital Collections.
Friday, August 1, 2014
Days 6 & 7: Processing Plan - Series Description
The basic types of documents found in the typical personal papers collection are manuscripts and correspondence. Clarence G. Campbell was no different in this respect. The fact that he was a doctor and eugenicist adds an additional layer of scientific knowledge to his collection; however, it does not alter the basic record series present.
The first nine folders of the collection consisted almost exclusively of manuscripts and research on eugenics topics. Folders 1 through 5 focused on early eugenics while folders 7 through 9 focused more on later eugenics topics and his book Race Survival. Folder 10 consisted entirely of correspondence dealing with either the Eugenics Research Association, the Eugenics Record Office, or commenting on research and manuscripts (both his and others'). Folder 11 contained only published materials (mostly Clarence G. Campbell's) on eugenics.
As such, the container organizational scheme was pretty clear:
- Box one: Early Eugenics Career
  - Folder one (topical organization) - Eugenics and Race (1933-1936)
  - Folder two (topical organization) - Eugenics and Breeding/Marriage (undated)
  - Folder three (topical organization) - Eugenics and Evolution (1929-1932)
  - Folder four (topical organization) - Eugenics and Controversy (circa 1930)
  - Folder five (topical organization) - Eugenics and the Birth Control Controversy (undated)
- Box two: Later Eugenics Career
  - Folder seven (topical organization) - Eugenics and Dysgenics (undated holographs)
  - Folder eight (topical organization) - Eugenics and Sterilization (1936-1938)
  - Folder nine (topical organization) - Eugenics and Race Improvement (undated)
- Box three: Correspondence and publications
  - Folder ten (record series organization) - Correspondence (1929-1933)
  - Folder eleven (record series organization) - Published articles
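The scheme above is a two-level hierarchy: boxes that hold folders. Written out as a nested data structure, it could look like the sketch below. The variable and function names are my own, and this is just one way to hold the list in code, not how ArchivesSpace stores containers internally.

```python
# Each box maps to a list of (folder number, folder title, dates) tuples,
# copied from the list above.
container_scheme = {
    "Box one: Early Eugenics Career": [
        ("one", "Eugenics and Race", "1933-1936"),
        ("two", "Eugenics and Breeding/Marriage", "undated"),
        ("three", "Eugenics and Evolution", "1929-1932"),
        ("four", "Eugenics and Controversy", "circa 1930"),
        ("five", "Eugenics and the Birth Control Controversy", "undated"),
    ],
    "Box two: Later Eugenics Career": [
        ("seven", "Eugenics and Dysgenics", "undated holographs"),
        ("eight", "Eugenics and Sterilization", "1936-1938"),
        ("nine", "Eugenics and Race Improvement", "undated"),
    ],
    "Box three: Correspondence and publications": [
        ("ten", "Correspondence", "1929-1933"),
        ("eleven", "Published articles", ""),
    ],
}


def render_scheme(scheme):
    """Return the scheme as indented text lines, boxes first, folders beneath."""
    lines = []
    for box, folders in scheme.items():
        lines.append(f"- {box}")
        for number, title, dates in folders:
            suffix = f" ({dates})" if dates else ""
            lines.append(f"  - Folder {number} - {title}{suffix}")
    return lines


folder_count = sum(len(folders) for folders in container_scheme.values())
print("\n".join(render_scheme(container_scheme)))
print(f"{len(container_scheme)} boxes, {folder_count} folders")
```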