Data Visualization: Methods for “Bicycle Crashes Rise in Boston and Cambridge as Cyclists Call for Change”

The project focuses on bike lane infrastructure in the Greater Boston area, specifically on how failures to separate bikes from motor vehicles on roads lead to a higher cyclist accident rate. To show this relationship, the visualizations outline how developed bike protections are, and include a density heat map that shows bike crash hotspots across Boston and Cambridge and a scatter point map depicting annual crashes. After scraping data made publicly available by each city, we compiled several datasets spotlighting the main conclusions we drew from the data: bicycle accidents have increased since the COVID-19 pandemic, most accidents happen where there is no bike infrastructure or where bike lanes are not separated from the road, and many of the roads and intersections where crashes occur are busy. The reporting bolstered these findings: several community organizers, including the nonprofits attached to the Ride for Your Life event in Cambridge that anchors this story’s narrative, have been demanding motorist accountability for pedestrian and cyclist fatalities for years. Cyclist advocates statewide are lobbying for automated enforcement measures, like sensor devices that photograph vehicles committing camera-enforceable violations. Critics say the legislation challenges the autonomy of motorists and police officers, but the findings of this report support the idea that if Boston and Cambridge invested in bike lane infrastructure, residents would be more motivated to use public transportation. While the project is not a body of work meant for advocacy, we acknowledge that it could be used as a resource by these groups to support their call for stricter regulations.

This story began as an attempt to see if there was a pattern in the neighborhoods where bike crash fatalities were occurring. After we scraped the data and found that, on average, the annual number of bike crash fatalities has continued to increase since 2020, it developed into a fuller retelling of the history of advocacy work to bring awareness and safety precautions to bike lanes. Initially, we thought the data would yield information about bike crash hotspots, which was correct to an extent, as we logged several neighborhoods that consistently had reported crashes. We modeled our visualizations after these findings and decided to complement them with first-person testimonials from those affected by this issue. That led us to the Ride for Your Life event in Cambridge, where we spoke to several event organizers and residents from 25 different cities and met victims of bike crashes who could personify the data points. Most people were forthcoming with personal details about their crashes or were eager to voice their support for automated enforcement measures, with many of them citing the pending legislation. That made it imperative to include a brief breakdown of the legislation, including where it stands now and where it seeks to progress. The data visualizations aim to give a sweeping overview of Boston and Cambridge’s bike lane fatalities year to year and to identify which areas present the most danger to cyclists.

The bicycle accident data came from Analyze Boston, which uses dispatch records, and from the City of Cambridge Police Department. The Analyze Boston data is updated biannually, so records were available only through June 30, 2024. The Cambridge data is updated weekly, but we used records through Oct. 31, 2024, the date we first found the data. Each dataset recorded the type of vehicle involved in the accident, so we separated the bike accidents from those involving other vehicles and put them on a different sheet in Excel. After looking through each dataset, we discovered that the Cambridge data did not include geotagged locations, so we manually entered coordinates for each crash using the intersections and addresses provided with the data. Some records listed only a street name, so we used the point where Apple Maps places that street. We then combined the Boston and Cambridge data in Excel, keeping the fields both sets provided: the date, the latitude and longitude, the street intersection, and the address if the accident didn’t occur at an intersection.
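For readers who prefer code to spreadsheets, the filter-and-combine step could be sketched in Python roughly as follows. We did this work by hand in Excel, so this is only an illustration: the field names (`vehicle_type`, `date`, `lat`, `lon`, `location`), the helper `bike_rows` and the sample rows are hypothetical stand-ins, not the actual columns or records in either city's dataset.

```python
# Hypothetical field names shared by both cities; the real datasets differ.
COMMON_FIELDS = ("date", "lat", "lon", "location")


def bike_rows(rows, city):
    """Keep only bicycle records and reduce them to the shared fields."""
    kept = []
    for row in rows:
        if row.get("vehicle_type", "").strip().lower() != "bike":
            continue  # skip crashes that did not involve a bicycle
        record = {field: row.get(field) for field in COMMON_FIELDS}
        record["city"] = city
        kept.append(record)
    return kept


# Made-up sample rows, for illustration only.
boston = [
    {"vehicle_type": "bike", "date": "2023-05-01", "lat": 42.35,
     "lon": -71.06, "location": "A St & B St"},
    {"vehicle_type": "mv", "date": "2023-05-02", "lat": 42.36,
     "lon": -71.05, "location": "C St"},
]
cambridge = [
    # Coordinates left empty to mirror records that needed manual geotagging.
    {"vehicle_type": "Bike", "date": "2023-06-10", "lat": None,
     "lon": None, "location": "D St & E St"},
]

combined = bike_rows(boston, "Boston") + bike_rows(cambridge, "Cambridge")
print(len(combined))  # 2
```

Rows with empty coordinates are kept rather than dropped, so they can be filled in afterward from the listed intersection or address.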

Using Tableau, we created several visualizations to enhance our story. First, we made a map that shows all 2,287 bicycle crashes color coded by year. We originally tried to use data from as far back as 2016, but there were almost 4,000 data points and the map heavily lagged our computer. We then decided to use data from the past five full years and this year because it was an even number of years and it would show crashes since Cambridge passed its Cycling Safety Ordinance. Color coding by year allowed us to see that 2019 and 2023 had the most accidents. When someone hovers their cursor over a point, they will see the day of the accident, and they can move around the map to look for a specific location. There is a legend where each year is assigned a shade of blue and a filter where a person can select a certain year and find crashes from that specific year. We chose blue because it was a distinct color that wouldn’t appear on a street map.
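The year window and the per-year tally behind the color coding can be sketched in a few lines of Python. This is a hedged illustration rather than our Tableau workflow: the function name `crashes_per_year` is hypothetical, the 2019 to 2024 window is our reading of "the past five full years and this year," and the sample dates are invented.

```python
from collections import Counter
from datetime import date


def crashes_per_year(crash_dates, first_year, last_year):
    """Count crashes per calendar year, keeping only first_year..last_year."""
    return Counter(
        d.year for d in crash_dates if first_year <= d.year <= last_year
    )


# Made-up dates for illustration; not taken from either city's data.
sample = [date(2016, 3, 1), date(2019, 7, 4), date(2019, 9, 9), date(2023, 5, 5)]
counts = crashes_per_year(sample, 2019, 2024)
print(counts.most_common(1))  # [(2019, 2)]
```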

The second map has the same accident locations, all in one shade of blue, with the street’s bicycle infrastructure layered on top. The infrastructure data came from the Massachusetts Geospatial Data Hub and was updated January 30, 2024. In Tableau, we connected the data by creating a relationship between the infrastructure and accident data using the ID numbers. We selected the extract connection option so all of the data would join. Using the accident data, we created a map and added the latitude twice to create two separate maps. Removing the accident data from the second map allowed us to add the infrastructure data there and combine the maps using the dual axis feature. We used shades of red for the types of bike infrastructure because red does not appear on the street map and the blue accident dots are easy to see against it.

The third map we made was a heat map, built the same way as the first map except that we selected density instead of the automatic points. We chose blue again to represent the heat data. The map shows where the most accidents occur, which allowed us to use that information in our story.

We created the graph on hospitalizations so the data could be presented alongside the interviews we conducted. Only Cambridge had data on injuries and hospitalizations because Boston does not record the severity of crashes to protect the privacy of individuals, according to the data description on Analyze Boston. Cambridge records severity only up to hospitalization, and that figure is estimated. We learned that 2019 and 2022 had the most hospitalizations and made a table of the number of injuries per year to add as context to the article. We again used blue to keep a consistent theme throughout.

To create a standardized method of comparing cyclist fatalities across different urban environments, and to understand where Boston and the greater Boston area stand from a national safety perspective, we collected cyclist fatality data from the National Highway Traffic Safety Administration's 2022 Traffic Safety Facts Annual Report Tables, focusing specifically on cities with populations over 500,000. Turning raw fatality numbers into comparable per-capita metrics was the key challenge. This normalization is critical because it removes population size discrepancies and allows for an accurate comparison.

For example, if Boston had 8 cyclist fatalities in a population of 675,647, the normalized rate would be (8 ÷ 675,647) × 100,000, resulting in 1.18 deaths per 100,000 residents. By applying this consistent method across all cities, we created a precise comparison of cycling safety across different urban landscapes.
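The same calculation can be written as a short Python function. This is a minimal sketch, not the spreadsheet formula we actually used; the function name `fatality_rate_per_100k` is hypothetical, and the inputs are the example figures from the paragraph above.

```python
def fatality_rate_per_100k(fatalities: int, population: int) -> float:
    """Return deaths per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return fatalities / population * 100_000


# Example figures from the text above (an illustration of the method).
boston_rate = fatality_rate_per_100k(8, 675_647)
print(round(boston_rate, 2))  # 1.18
```

Applying one function to every city keeps the comparison consistent, since each rate is computed from that city's own population.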

Once the figures were normalized, we entered all collected data into Excel, cleaned and organized it into two tables, and uploaded it to be visualized in Flourish’s Interactive Scatter Plot tool. Arranged this way, the data revealed, to our surprise, that Boston stands out as remarkably safe for cyclists among major U.S. cities. This graph offers a broader, national perspective while maintaining the integrity and importance of our story's narrative.

We used Knight Lab's StoryMap JS to create an interactive narrative exploring cycling fatalities in Boston, demonstrating how individual tragedies reveal persistent infrastructure challenges even in one of America's statistically safest cycling cities. The mapped points allow viewers to move through space while engaging with multiple layers of information: the specific infrastructure challenges at each location, the personal stories of cyclists lost, and the policy changes that followed each incident or, in some cases, the warnings that preceded it.

This process began with research into cycling-related deaths in Boston, focusing particularly on cases where infrastructure gaps played a role. For each mapped location, we documented not just the circumstances of the accident but also the lives of those lost, working to emphasize the importance of every life lost in a tragic accident. A critical pattern emerged during the research phase: many safety improvements came as reactions to fatalities rather than as preventive measures. For example, Eric Hunt's death involving an MBTA bus in 2010 led directly, within the month, to new MBTA training protocols.

The three maps communicate the data to the reader on a geographic scale. This article, while not a solutions-oriented piece of journalism, describes several pieces of legislation that lobbyists are attempting to push through the Legislature, which made it necessary to include an infographic that breaks down technical jargon into more accessible language. Because the legislative measures are introduced far into the article, a simplified visual summary of that text is crucial to the success of the project. The reporting largely complemented the findings of the publicly available data. Because this is an issue that many Massachusetts residents care about (over 350 people from 25 cities showed up to the memorial ride in November), cyclists and pedestrians were eager to discuss reform options. This story exists because of the cooperation of the community in telling it.

Alex Lott led the data visualization components of this story. They were instrumental in researching, scraping, cleaning, and presenting the data, as well as finding the evidence used to come to the story’s conclusions. They created several maps for this project’s use: namely the plot point map depicting the sites of bike crash fatalities, the map with accidents and bike infrastructure, and the heat map showing the density of crash hot-spots. They also contributed to the reporting and writing of the narrative and methodology. Jack Kaplan created the multimedia components and infographic visualizations within the story. He photographed several bike crash sites, and designed both the infographic that breaks down several pieces of cycling safety legislation and the scatterplot showing Greater Boston’s place within national trends. He also contributed to the reporting and writing of the narrative and methodology. Adri Pray led the reporting initiatives and assisted in finding data for the visualizations. She attended the Ride for Your Life memorial ride on Nov. 17 and spoke with several cyclists and organizers about their efforts to raise the public’s awareness of roadside fatalities. She largely wrote the narrative feature and the methodology.

Read the story “Bicycle Crashes Rise in Boston and Cambridge as Cyclists Call for Change” here.

Download the data set on Boston and Cambridge Accidents here.
