A Block in San Francisco

a digital humanities project using 1900 census data

Which one?


Between Market, 7th, Stevenson, and 8th St in San Francisco, today you can find a coffee shop, a grocery store, a phone shop, a tobacco shop, a library, a dance studio, a theater, an office building, a few government offices, and a few non-profits. At first sight, this block off busy Market St is hardly residential.

Use the Google map below to explore and see for yourself.

Let's go back to 1900

Photo credit: D. H. Wulzen Glass
Plate Negative Collection (Sfp 40),
San Francisco History Center,
San Francisco Public Library.

The city


Back in 1900 the city's map looked a little different. After the 1906 San Francisco earthquake and fires, the city was rebuilt and its layout changed. The 1905 Sanborn fire insurance maps atlas shows what the street plan of the city looked like before the disaster.

Hover over the map to take a closer look and explore what the block between Market, 7th, Stevenson and 8th St was like back then.

The residents


Before San Francisco became the tech hub it is today, Market St was a bit more residential. The 1900 census data recorded at least 154 permanent residents in our block, capturing their year and place of birth among other information. The map below shows their homes. You'll notice the pins don't fully match current building or house numbers - most of the buildings no longer exist.

The pins on the map above are color-coded by home country; click on any pin to learn more about the specific resident.

  1. I digitized a few sheets from the 1900 census, with the help of Lea Henaux.
    1. The person responsible for collecting census data of this neighborhood (Archie L. Hyde as indicated on the sheets) had cursive handwriting, which is at times hard to read. I used this cursive style as a reference to resolve unclear words and letters.
    2. I kept transcripts as close to the original as possible. Illegible entries are marked [illegible]. Empty cells are kept empty. Name spelling is preserved.
  2. I used the 1905 Sanborn fire insurance maps atlas to filter out the house numbers that are not part of the block.
  3. I georeferenced the Sanborn map and the residents' addresses using QGIS. As most georeferences were inaccurate, I manually changed them to match the houses on the Sanborn map.
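
The house-number filter in step 2 can be sketched in Python as follows. This is only an illustration, not the procedure used for the project: the function name `in_block` and the number ranges in `BLOCK_RANGES` are placeholders made up for the example, and the real ranges would have to be read off the Sanborn sheets.

```python
def in_block(street, number, block_ranges):
    """True if the house number falls in one of the ranges listed for its street."""
    return any(low <= number <= high for low, high in block_ranges.get(street, []))

# Placeholder ranges for illustration only; NOT read from the 1905 atlas.
BLOCK_RANGES = {
    "Market St": [(1101, 1199)],
    "Stevenson St": [(600, 699)],
}

# Invented (street, house number) records standing in for transcribed census rows.
records = [("Market St", 1195), ("Market St", 1250), ("Mission St", 1195)]

# Keep only the rows whose address lies inside the block.
kept = [r for r in records if in_block(r[0], r[1], BLOCK_RANGES)]
print(kept)
```

Keeping the ranges in a separate table makes it easy to correct them street by street as the atlas is checked.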

Let's step back and look at the bigger map

Year & place of birth

The map above displays the birthplaces of the residents in the block in 1900. You can adjust the time period with the slider and get information on specific dots by hovering over them.

  1. I georeferenced the places of birth using Awesome Table's Geocode plugin for Google Sheets.
    1. For countries, Geocode picks the center as a point of reference, using modern day international borders. Due to limited information and changing borders, the points do not mark the precise place of birth but give a general location.
    2. Many of the residents were born in the same places, resulting in overlapping points. This makes the map hard to read and obscures the actual number of residents. To work around this, I added an element of randomness to the georeferenced coordinates, adding a random value between -1 and 1 to each. This makes for a negligible shift in the latitude and longitude of each location; the point stays within the original country and captures the place of birth with the same level of precision.
    3. Bohemia does not exist as a country anymore, and Geocode could not georeference it. In 1900, Bohemia's capital was Prague. I manually georeferenced Bohemia entries to point to Prague.
    4. I manually changed Russia's georeference to Moscow. Geocode placed Russia in Siberia (middle of the country), but my educated guess is that Russian immigrants came from the west of the country. On another note, Plot.js visualizes links between 2 points as the shortest path. The link between Siberia and the US (in the Connecting Families map) goes over the Pacific Ocean. This creates an inaccurate representation: white immigrants in the 19th century reached the US across the Atlantic; San Francisco became an important immigration port only in the 20th century, with its infamous Angel Island Station established in 1910. Hence, "moving" Russia to Moscow makes the map a better historical representation.
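
The jitter and the two manual corrections described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code behind the maps: the names `jitter`, `place_resident` and `MANUAL_OVERRIDES` are hypothetical, the offsets are assumed to be uniform and in degrees, and the coordinates for Prague and Moscow are approximate city centers.

```python
import random

# Hypothetical override table mirroring the Bohemia and Russia corrections.
# Values are approximate (latitude, longitude) of the city centers.
MANUAL_OVERRIDES = {
    "Bohemia": (50.08, 14.44),  # Prague
    "Russia": (55.75, 37.62),   # Moscow
}

def jitter(lat, lon, spread=1.0, rng=random):
    """Shift each coordinate by a uniform random offset in [-spread, spread] degrees."""
    return lat + rng.uniform(-spread, spread), lon + rng.uniform(-spread, spread)

def place_resident(birthplace, geocoded, rng=random):
    """Use the manual override if there is one, else the geocoded point, then jitter it."""
    lat, lon = MANUAL_OVERRIDES.get(birthplace, geocoded)
    return jitter(lat, lon, rng=rng)

# A seeded generator keeps the dots in the same place every time the map is rebuilt.
rng = random.Random(1900)
print(place_resident("Bohemia", geocoded=(0.0, 0.0), rng=rng))
```

Passing a seeded generator is a design choice for reproducibility; without it, every rebuild of the map would scatter the overlapping points differently.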

Shortcomings


In the 1900 census data, places of birth are recorded as countries, covering large and demographically diverse regions. The georeferencing tool used to produce these maps handles this lack of precision by placing each country's point at the center of its present-day territory. There are workarounds that improve the georeferencing, or at least the visualization (a cheap and easy one is to use bigger markers that cover a larger area to represent the vagueness), but we cannot go beyond the information that is available. Historical sources are often incomplete and fragmented, while working with data implies pinpoint accuracy.

Another issue with this visualization is that the slider implies some dynamism, while the dots remain static. This arises from a lack of information, as we only have knowledge of the residents' birthplaces and their location in July 1900. The time in between remains uncertain. Although the census contains the year of immigration to the US for some of those born outside the country, it does not capture their point of entry. We can fill in some gaps through inductive reasoning.

Census data: what does it tell us?


The 1900 census provides both explicit and implicit data about the block's inhabitants. Explicit data refers to information that is readily provided either by the census official (Archie L. Hyde), such as address, or by the individuals themselves upon inquiry, such as name, place and year of birth, occupation, etc. Implicit data is anything else that can be inferred from the data. For example, based on a resident's known month of birth, we can deduce their zodiac sign (did you notice it on the map above?). Although this kind of information may not reveal much, it serves as a useful example of deductive inference from the available data.

Inductive reasoning is more likely to produce erroneous conclusions but is also potentially more enlightening when working with census data. Based on occupation and location, we can infer annual income; based on country of origin, we can make inferences about religious and cultural belonging; and so on.

Going back to the map above, we can track families' locations throughout the years based on their children's place and year of birth.
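
As a rough sketch of this idea in Python: each birth gives one dated location, and the census gives the last one. The class `Member`, the function `known_locations` and the family below are invented for illustration and are not taken from the census sheets; the sketch also assumes the family stayed together at each child's birth.

```python
from dataclasses import dataclass

@dataclass
class Member:
    role: str         # e.g. "head", "wife", "son"
    birth_year: int
    birthplace: str

def known_locations(members, census_year=1900, census_place="San Francisco"):
    """Return (year, place) pairs in chronological order: one per birth, then the census."""
    points = sorted({(m.birth_year, m.birthplace) for m in members})
    points.append((census_year, census_place))
    return points

# Invented family, for illustration only.
family = [
    Member("head", 1850, "Ireland"),
    Member("wife", 1855, "Ireland"),
    Member("son", 1880, "New York"),
    Member("daughter", 1885, "California"),
]

for year, place in known_locations(family):
    print(year, place)
```

Everything between two consecutive points stays unknown: the sketch only orders what the census records, it does not interpolate a route.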

Connecting family data

The map above allows you to track families' known locations prior to 1900. Use the drop down menu to choose a family and the slider to adjust the time period.

  1. I organized residents into families, based on (in order of consideration) relationship to household head, last name, marital status, gender and year of birth.
    1. A family consists of "head" (typically male but can be a widowed woman), "wife", "son", "daughter". Other relationships like "nephew", "step-daughter" or "brother-in-law" are also considered part of the family.
    2. The 1900 census data assumes there can be only one head for each household. If there are 2 or more families living under the same roof, only one family (presumably the house owners) is listed as family members. All others are marked as "lodgers".
      1. I used the combination of last names, gender and marital status to inductively determine couples. For example, two consecutive entries with the same last name, both married, one male and one female and with an age difference of less than 15 years are likely a couple. This inductive inference rests on 2 assumptions: (1) only heterosexual marriages were legal in 1900, and (2) wives would adopt their husband's last name.
      2. Extending the above, I used last names and parents' place of birth along with year of birth and sex to determine sons and daughters in the family.
      3. It is possible, of course, that the "children" are nephews or the adults are not married but have coinciding last names. My assumptions are made following the Occam's razor principle, favoring the simplest possible explanation.
      4. For transparency, I preserved the "lodger" label along with the presumed role in the family. For example, L. Landram is marked "lodger, likely wife".
    3. I gave each family an integer code. Single individuals with no assumed family belonging are marked as separate families.
  2. I filtered out families with fewer than 3 members, excluding single people and married couples with no children living with them.
  3. I plotted families' known locations using the georeferenced places of birth and year of birth, assuming the father was present for the birth of each child (it is logically necessary the mother was present).
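
The couple rule from the steps above (two consecutive entries, same last name, both married, one male and one female, less than 15 years apart) can be sketched in Python. This is an illustration under assumptions, not the procedure actually applied to the sheets: the `Entry` fields, the "M"/"F" and "married" codes, and the "lodger, likely head" label for the husband are hypothetical, and the difference in birth years stands in for the age difference.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    last_name: str
    sex: str             # "M" or "F" (hypothetical coding)
    marital_status: str  # "married", "single", ... (hypothetical coding)
    birth_year: int
    relation: str        # as written on the sheet, e.g. "head", "wife", "lodger"

def likely_couple(a, b, max_age_gap=15):
    """Apply the rule: same last name, both married, opposite sex, small age gap."""
    return (
        a.last_name == b.last_name
        and a.marital_status == "married"
        and b.marital_status == "married"
        and {a.sex, b.sex} == {"M", "F"}
        and abs(a.birth_year - b.birth_year) < max_age_gap
    )

def label_lodger_couples(entries):
    """Return one label per entry, keeping "lodger" next to the presumed family role."""
    labels = [e.relation for e in entries]
    for i in range(len(entries) - 1):
        a, b = entries[i], entries[i + 1]  # only consecutive entries are compared
        if a.relation == "lodger" and b.relation == "lodger" and likely_couple(a, b):
            for j, e in ((i, a), (i + 1, b)):
                role = "head" if e.sex == "M" else "wife"
                labels[j] = f"lodger, likely {role}"
    return labels

# Invented entries, for illustration only.
entries = [
    Entry("Smith", "M", "married", 1850, "head"),
    Entry("Smith", "F", "married", 1855, "wife"),
    Entry("Doe", "M", "married", 1860, "lodger"),
    Entry("Doe", "F", "married", 1866, "lodger"),
    Entry("Roe", "M", "single", 1870, "lodger"),
]
print(label_lodger_couples(entries))
```

Returning labels instead of overwriting the relation column keeps the original transcription intact, in line with the transparency note above.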

All roads lead to San Francisco


The residents living inside the block of Market, 7th, 8th and Stevenson St were a diverse (all white) group. The grouping by families aims to capture the different paths they took to reach the same block in San Francisco.

Both maps combined suggest that most residents of the block in 1900 moved to (or were born into immigrant families in) California after 1860. This coincides with the rapid population growth of San Francisco in the decades that followed the Gold Rush.

Another map, more shortcomings


The general trend of the family routes above is westward, meaning most families likely arrived in the US by crossing the Atlantic Ocean. On average, the 5000 km trip across the ocean took roughly 6-10 days. California, and the West Coast in general, was another 5000 km away - a journey that even without any stops along the way could take weeks or even months.

The map above is clearly incomplete. The limited information results in very few arrows across big distances, more representative of modern-day flights than of the long routes along roads, rivers and seaways across 19th century Europe and North America. It is possible to fill in some of the location gaps with information from other resources like censuses from different years, immigrant ship passenger records, other archives, etc. Personal diaries, memoirs, family histories and similar sources can make the narratives more human. Multilayered, interlinked data would allow us to situate immigrants' backgrounds and experiences.

Continue exploring

My sources


You can download the filtered & scrubbed dataset (150 entries, only selected columns and initials instead of first names) used for the maps above directly from my Observable notebooks.

Find the full transcribed dataset (350 entries) here, available for Minerva University students and faculty, or upon request.

Other sources


The 19th to 20th century immigration to the US is fairly well-recorded. There are many (far more extensive than mine) digitized records and archives of immigrant arrivals in the 19th century, such as:

  • The Ellis Island Passenger Database contains records of over 60 million passengers who arrived at the port of New York in the 19th and 20th centuries.
  • The Immigrant Ship Transcribers Guild is a database of 26 000+ entries from multiple ports. The organization is completely operated by volunteers, and they are open to new members.
  • The National Archives has an index of digitized collections by different organizations. Search keyword "passengers" to find relevant databases.

Where to start?


Google & most databases require specific inquiries to retrieve data. If you don't know where to start, here are some challenges for you.

  • French-born taxidermist Ernest Lorquin lived at 1195 Market St. Where was his shop?
  • The 1900 census data alone allows us to track families' locations throughout the years, but we can do the same for individuals by combining multiple sources. Can you map John Spade's journey from Russia to 1139 Market St?
  • Margaret Raleigh, mother of 3 and head of the 652 Stevenson St household, was born at sea in May 1855. Try to track her family's voyage to the US. What social norms and power structures make this harder (e.g. compared to John Spade's case)?