Obsessed with Boba? Analyzing Bubble Tea Shops in NYC Using the Yelp Fusion API

In this workshop, we explore and develop insights about NYC’s Bubble Tea Shops using the Yelp Fusion API. Sections include:

  • How to use the Yelp Fusion API,
  • Data Cleaning, Wrangling and Visualizations in Python,
  • A demo of our web app created in Jupyter Book and Streamlit

The questions we explore cover bubble tea shop locations, Yelp ratings, review counts, and price.

After an initial introduction of each section, participants will join break-out groups depending on which topic they would like to learn more about. These break-out sessions will be hands-on and interactive. Participants will then reconvene for a Q&A and final thoughts. Attendees will gain a better understanding of the data analysis workflow and will leave with skills and a template to uncover insights with any dataset.

This workshop recommends beginner-level proficiency with Python and is focused on applying Python to data analysis; however, those new to Python are gladly welcome!

Tools used in the workshop include:

  • Python
  • Jupyter Notebook
  • Streamlit or Dash
  • Yelp Fusion API (this is a free API that only requires registration)

A preview of the workshop can be found on GitHub here: https://github.com/mebauer/boba-nyc. For more information about requirements and library dependencies, please visit our GitHub page under the section Dependencies.
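As a rough illustration of the first topic, here is a minimal sketch of a business search against the Yelp Fusion API, followed by flattening the response into the fields the workshop asks about (location, rating, review count, price). It is not the workshop's own code. The helper names, the search term, and the sample payload are assumptions made for illustration, and the API key is a placeholder you would obtain by registering.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(api_key, term="bubble tea", location="New York City",
                         limit=50, offset=0):
    """Build (but do not send) a business-search request with a Bearer key."""
    query = urllib.parse.urlencode(
        {"term": term, "location": location, "limit": limit, "offset": offset}
    )
    return urllib.request.Request(
        f"{SEARCH_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_shops(api_key):
    """Send one search request and return the decoded JSON (needs network)."""
    with urllib.request.urlopen(build_search_request(api_key)) as resp:
        return json.load(resp)


def flatten_businesses(payload):
    """Keep only the fields needed for location, rating, and price questions."""
    rows = []
    for biz in payload.get("businesses", []):
        coords = biz.get("coordinates") or {}
        rows.append(
            {
                "name": biz.get("name"),
                "rating": biz.get("rating"),
                "review_count": biz.get("review_count"),
                "price": biz.get("price"),  # not every listing has a price
                "latitude": coords.get("latitude"),
                "longitude": coords.get("longitude"),
            }
        )
    return rows


# Made-up payload in the shape of a search response, so the sketch runs offline.
sample_payload = {
    "businesses": [
        {"name": "Example Tea House", "rating": 4.5, "review_count": 120,
         "price": "$$", "coordinates": {"latitude": 40.7128, "longitude": -74.0060}},
        {"name": "Sample Boba Bar", "rating": 4.0, "review_count": 35,
         "coordinates": {"latitude": 40.73, "longitude": -73.99}},
    ]
}
rows = flatten_businesses(sample_payload)
```

With a real key, `flatten_businesses(fetch_shops(api_key))` would return rows in the same shape, ready to load into a table for cleaning and plotting.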

Between Physarum lattices and NYC recycling data

“3d printers add materials in layers; nature doesn’t. Nature grows.” - Neri Oxman, an Israeli-American designer and professor at the MIT Media Lab.

Rather than exploring how multiple materials can be assembled into an object, we started by looking to nature for a model of growth from a single material.
We explore Physarum, a slime mold popularly known as “the blob.” Our research establishes a mathematical model that describes the growth and decay of Physarum over time. The model is then used to produce data visualizations that draw on decay, density, color, speed, 3D space, and time to portray a rich, insightful picture.

The proposed framework allows artists and designers to leverage the generative model we developed. We will showcase the model on NYC recycling data. Our exploration will help users understand how much each of us contributes to the communal effort of recycling, and how we can improve our waste management and daily habits to keep our planet green and clean.

The A-Train to Knicks Pain: Social Media Analysis of NYC Basketball Fans

If there’s one thing to know about New York, it’s that the basketball fans here are generally more “passionate” than most. Unfortunately, that passion can turn to pain, especially for Knicks fans, who have often been disappointed over the last four decades. That brings us to this project. We at “Rambles from The Garden” aim to understand how New York Knicks fans really feel about their team, players, and coach through analysis in the programming languages Python and R. We will pull hundreds of thousands of Knicks-related tweets to build time series, lexicon, sentiment, and other analyses.

By pairing Twitter data with NBA stats such as wins, losses, and players’ points per game (PPG), our end goal is to gauge the fanbase’s reactions across social media through a fine lens (for instance, getting a grasp on Knicks fans’ love-hate relationship with their former All-Star, Julius Randle).

We’re also going to have a little fun by trying to settle who has the better fan base in New York City, the Knicks or the Nets, by comparing tweets to see which fans respond more positively to their team.
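To make the tweet comparison concrete, here is a minimal sketch of lexicon-based sentiment scoring in Python. It is not the presenters' pipeline. The tiny word list, the function names, and the example tweets are all made up for illustration, and a real analysis would rely on a published sentiment lexicon and far more data.

```python
import re
from statistics import mean

# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
LEXICON = {
    "love": 1, "great": 1, "win": 1, "clutch": 1,
    "hate": -1, "pain": -1, "awful": -1, "loss": -1,
}


def score_tweet(text):
    """Sum the lexicon scores of every word in the tweet (unknown words count 0)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def mean_sentiment(tweets):
    """Average tweet score for one fan base; 0.0 if there are no tweets."""
    return mean([score_tweet(t) for t in tweets]) if tweets else 0.0


# Made-up tweets standing in for the two fan bases.
knicks_tweets = ["Love this team, great win tonight", "Another loss. The pain never ends"]
nets_tweets = ["Awful defense, I hate watching this", "Clutch shot at the buzzer"]

knicks_mean = mean_sentiment(knicks_tweets)
nets_mean = mean_sentiment(nets_tweets)
```

Comparing `knicks_mean` with `nets_mean` gives a first, crude answer to which fan base sounds more positive. Grouping tweets by game date before averaging would turn the same scores into a time series that can be lined up against wins and losses.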

Low-code visual data exploration with NYC public data

Have you ever felt trapped by only having a few chart types to choose from in your spreadsheets, and not having the time, budget, or experience to realize the visualization of your dreams?

In this session, you will learn to use free, web-based, and open source low-code tools from industry and academia. When these tools are combined, you’ll have a powerful toolkit for uncovering and organizing visual insights from relatable public NYC datasets in dozens of creative ways.

Featured Datasets:

Featured Tools:

  • Graphpad (Vega-Lite) and Excalidraw (Free and open source) plugins for Figjam (closed source, but has free plan)
  • Rawgraphs (Free and open source)
  • Voyager2 (Free and open source)
  • Kepler.gl (Free and open source)
  • Sanddance (Free and open source)
  • Datasette (Free and open source)

For best results, please join the session from a computer. These tools will not work as well from a tablet or mobile phone.

Data Cleaning Techniques

You’ve got your data loaded, you start on your analysis, and… WHAM, missing values. WHAM, junk entries. WHAM, capitalization inconsistencies.

Data cleaning often feels like a chore, and we often do as little of it as necessary. What if we took a more systematic approach? In this hands-on workshop, we’ll explore common data issues to look for, along with tools and techniques for cleaning them up, giving us a better understanding of our data in the process and clearing the path for smoother data analysis and manipulation.
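The three problems named above (missing values, junk entries, and capitalization inconsistencies) can be handled in one systematic pass. The following is a minimal sketch using only the Python standard library. It is not taken from the workshop. The junk-token list, the function names, and the sample rows are assumptions chosen for illustration.

```python
# Tokens treated as "no value"; this list is an assumption, adjust per dataset.
JUNK = {"", "n/a", "na", "none", "null", "-", "?"}


def clean_value(value):
    """Trim, collapse inner whitespace, and map junk tokens to None."""
    if value is None:
        return None
    text = " ".join(str(value).split())
    return None if text.lower() in JUNK else text


def clean_rows(rows, title_case=()):
    """Clean every cell; title-case the named columns to fix capitalization."""
    cleaned = []
    for row in rows:
        new_row = {key: clean_value(val) for key, val in row.items()}
        for col in title_case:
            if new_row.get(col) is not None:
                new_row[col] = new_row[col].title()
        cleaned.append(new_row)
    return cleaned


def missing_counts(rows):
    """Count missing (None) cells per column, to see where the gaps are."""
    counts = {}
    for row in rows:
        for key, val in row.items():
            counts[key] = counts.get(key, 0) + (val is None)
    return counts


# Made-up rows showing all three problems at once.
raw = [
    {"borough": "  brooklyn ", "agency": "DSNY"},
    {"borough": "BROOKLYN", "agency": "n/a"},
    {"borough": "", "agency": "dsny"},
]
cleaned = clean_rows(raw, title_case=("borough",))
gaps = missing_counts(cleaned)
```

Running the missing-value count after cleaning, rather than before, matters here: junk entries such as `"n/a"` only show up as gaps once they have been mapped to `None`.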

The session will be led by Aidan Feldman, who has been a technologist working with government and non-profits for the past seven years. He also teaches about code and data to public policy students at NYU.

Checkbook NYC Advanced Searches

This session will give an overview of the Smart Search, Advanced Search, Narrow Down Faceted Search, and Datafeeds search mechanisms within Checkbook (https://www.checkbooknyc.com/). The objective is to help you understand which search technique is best suited to your query. We will use some sample searches and take actual user examples to show how to refine an inquiry. We will also review how to gain insight from the contract detail pages.

Pay Equity in NYC

Data scientists from the New York City Council will discuss how pay equity was assessed through a statistical investigation of data covering the NYC public workforce. Issues with the data will be addressed, and the modeling explained and justified. Outcomes of the analysis and potential impacts will also be covered.

Getting to Know Checkbook

This session will include a broad overview of the type of data that Checkbook (https://www.checkbooknyc.com/) offers and a guided tour of the site’s functionality. We will walk through the Spending and Contracts domains and the M/WBE and Subvendor Featured Dashboards, and review the various search capabilities, ways to export data, and how to create customized alerts.

The Tip of the Iceberg of Domestic Violence

Last year’s stay-at-home orders saved an untold number of lives. But they also may have fueled an increase in domestic violence in the United States. Learn about student work at Cornell Tech leveraging NYPD domestic violence complaints data to assess trends and patterns of domestic violence in New York City.

This session is organized by Cornell Tech’s Urban Tech Hub.

Moderated by: Anthony Townsend

Presentations by:

  • Preksha Agarwal
  • Eesha Khanna
  • Jenny Liu

Measuring Equity using Open Data: Building Evidence Based Policy and Reparation Advocacy

Learn how NYC Open Data can help community advocates, capital providers, and policymakers quantify the harmful impact of historical redlining by banks and other private-sector financial services companies.

A detailed case study will highlight how open data cross-pollinates local reparations initiatives, including in Evanston, IL, where open data is a critical input to building a groundbreaking reparations program.

The session will include a ‘get started’ action plan, best practices and other resources for anyone interested in housing, educational, environmental and healthcare reparations work in jurisdictions around the country.

Presentation Moderator:

David Scatterday, Managing Director, Scatterday & Associates


  • Enith Williams
    Executive Director, Reparations Finance Lab
  • Robin Rue Simmons
    Founder, First Repair
  • Linda J. Mann
    Adjunct Associate Professor, Columbia University, School of International & Public Affairs