What’s an API and Why Should I Care as an Activism Researcher?

Notes and reflections from the Community Data Science Workshop, presented by Benjamin Mako Hill and friends today at the University of Washington, Dept. of Communication.

Contents

  • Introduction
  • Using simple APIs
  • Putting data in a format you can use

Introduction

What is an API?

  • The term stands for Application Programming Interface.
  • It is a standard (or protocol), not a piece of software.
  • It’s a way for one program to talk to another program.
  • It’s a way to get data from online platforms about what people are doing on that platform.

Why should I care as an activism researcher?

  • Sometimes people are using those platforms for activism.
  • You can learn something (not everything) about activism activity on the platform by looking at the traces of that activity that individuals leave in the form of tweets, follows, edits, and more.
  • These traces are the data that you access through the API.

What can I do with an API?

  • Ask for data (almost always by requesting a URL).
  • Get data back (almost always in a text format called JSON).
  • Build a dataset of content to study (using the data you got through the API).
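The three steps above can be sketched in Python roughly as follows. This is only an illustration, not code from the workshop: the address https://api.example.com/posts, the hashtag and count parameters, and the sample response are all made up, and the actual web request is left out so the sketch runs without an internet connection.

```python
import json
from urllib.parse import urlencode

# Step 1: ask for data by building a URL.
# The base URL and parameter names used below are hypothetical placeholders.
def build_request_url(base_url, **params):
    """Return base_url with the query parameters appended."""
    return base_url + "?" + urlencode(params)

# Step 2: get data back as JSON text and turn it into Python objects.
def parse_response(json_text):
    """Parse a JSON string into Python lists and dictionaries."""
    return json.loads(json_text)

# Step 3: build a dataset by keeping only the fields you want to study.
def build_dataset(records, fields):
    """Return one row (a list of values) per record."""
    return [[record.get(field) for field in fields] for record in records]

url = build_request_url("https://api.example.com/posts", hashtag="protest", count=2)

# A made-up response standing in for what a platform might send back.
sample_response = (
    '[{"user": "ana", "text": "March at noon"},'
    ' {"user": "ben", "text": "On my way"}]'
)
dataset = build_dataset(parse_response(sample_response), ["user", "text"])
```

Each real platform has its own URLs and parameter names, so the first step is the one that changes most from one API to the next.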

Challenges

  • When a platform changes its architecture, the structure of data can also change.
  • Different platforms have different APIs.  Though they have similar features, you will need to learn each platform’s API separately.
  • The API structure and documentation tend to be better for platforms that make money off their API, like Twitter.
  • Platforms that have little stake in how people access their data tend to have APIs that are less well structured and less well documented.
  • However, for platforms whose data is commonly accessed via API (like Twitter), there are often existing Python modules that make your task easier.

Using Simple APIs

Use Python to go out onto the web, grab some data, and show it to you

  • Use Python, a programming language that is probably already installed on your computer and that you can run from your terminal (here’s how)
  • Here is example Python code that downloads the HTML code of the website http://www.python.org

Screen Shot 2014-05-03 at 12.24.54 PM

  • In this example, the API is HTTP, the standard that lets a program request data (HTML or other) from a website, as the code above does.
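For readers who cannot see the screenshot, a comparable sketch in Python 3 might look like this. It is not the workshop's exact code: the function name fetch_page is my own, and falling back to UTF-8 when the server does not report a character set is an assumption.

```python
from urllib.request import urlopen

def fetch_page(url):
    """Download the resource at url and return its contents as text."""
    with urlopen(url) as response:
        raw_bytes = response.read()
        # Use the character set the server reports; assume UTF-8 otherwise.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw_bytes.decode(charset, errors="replace")

# Example use (needs an internet connection):
# html = fetch_page("http://www.python.org")
# print(html[:200])
```

Printing only the first 200 characters keeps the terminal readable, since a full web page is usually thousands of characters long.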

Put data from the web into a file on your computer

  • This code puts the HTML code from http://www.python.org into a file on your computer

Screen Shot 2014-05-03 at 12.14.02 PM

  • Now the file is on your computer (you can find it by searching your computer for a file named python.html)
  • Use the function os.chdir to set your working directory, which is the place on your computer where the file is saved
  • If you started Python from a new terminal window, the default directory is usually your personal (home) directory.  For example, mine is called mjoyce.
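A minimal Python 3 sketch of saving a page and checking where it lands is below. The function name save_page is hypothetical and the workshop's screenshot may have done this differently; os.getcwd, os.chdir, and os.path.expanduser are standard library functions.

```python
import os
from urllib.request import urlopen

def save_page(url, filename):
    """Download url, write the raw bytes to filename, and return the full path."""
    with urlopen(url) as response:
        content = response.read()
    # A relative filename is saved in the current working directory.
    with open(filename, "wb") as output_file:
        output_file.write(content)
    return os.path.abspath(filename)

# Example use (needs an internet connection):
# print(os.getcwd())                  # where files will be saved right now
# os.chdir(os.path.expanduser("~"))   # switch to your home directory
# print(save_page("http://www.python.org", "python.html"))
```

Because save_page returns the full path, you can see exactly where the file ended up instead of searching your computer for it.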

Use a simple API that involves kittens

Screen Shot 2014-05-03 at 12.26.22 PM

  • And here’s what the file you created looks like on your computer

Screen Shot 2014-05-03 at 12.29.44 PM

  • In this example, the API is the convention that a URL ending in a pair of dimensions returns an image of a kitten with those dimensions
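The idea of putting the dimensions at the end of the URL can be sketched as follows. The post does not name the kitten service, so the address below is an assumption: placekitten.com is one service that follows this width/height pattern, and it may not be the one used in the workshop or may no longer be available. The function names are also my own.

```python
from urllib.request import urlopen

# Assumed service: the post does not name it. placekitten.com follows the
# "dimensions at the end of the URL" pattern described above.
KITTEN_SERVICE = "http://placekitten.com"

def kitten_url(width, height, base_url=KITTEN_SERVICE):
    """Build the URL for a kitten image of the given size in pixels."""
    return "{}/{}/{}".format(base_url, int(width), int(height))

def save_kitten(width, height, filename):
    """Download a kitten image of the given size and save it to filename."""
    with urlopen(kitten_url(width, height)) as response:
        image_bytes = response.read()
    # Images are binary data, so the file is opened in "wb" (write bytes) mode.
    with open(filename, "wb") as image_file:
        image_file.write(image_bytes)

# Example use (needs an internet connection and a working service):
# save_kitten(200, 300, "kitten.jpg")
```

Changing the two numbers changes the image you get back, which is the whole API in this case.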

Putting Data in a Format You Can Use

Why should I care about JSON?

  • When you get data from a website using an API, it will most often be in JSON format.

What is JSON?

  • JSON (JavaScript Object Notation) is a text format for structuring data.  It is a format used by programs for programs.
  • Here’s an example of what a JSON file looks like: http://json.org/example.html
  • Here’s a simpler example by Mako: http://mako.cc/cdsw.json
  • Its structure closely resembles Python’s own lists and dictionaries

Interpreting a JSON file

  • This is an entry about a pet fish with a name, age, and favorite color

Screen Shot 2014-05-03 at 11.41.43 AM
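To show how such an entry maps onto Python, here is a small sketch. Only the three kinds of fields (name, age, favorite color) come from the description above; the exact key names and the values are made up for illustration.

```python
import json

# Hypothetical entry: the key names and values are invented for this sketch.
fish_json = '{"name": "Bubbles", "age": 2, "favorite_color": "blue"}'

fish = json.loads(fish_json)    # a JSON object becomes a Python dictionary

name = fish["name"]             # JSON strings become Python strings
age = fish["age"]               # JSON numbers become Python ints or floats
color = fish["favorite_color"]
```

Once the entry is a dictionary, you look up each value by its key, just as with any other Python dictionary.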

Importing a JSON file into Python

  • This code displays a JSON file from the web in Python

Screen Shot 2014-05-03 at 11.45.35 AM

  • This imports the Python module that interprets JSON.

Screen Shot 2014-05-03 at 11.47.15 AM

  • This code stores the contents of the JSON file in a variable named “data” and displays them in Python

Screen Shot 2014-05-03 at 11.55.06 AM
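The steps in this section can be combined into one short Python 3 sketch. This is not the code from the screenshots: the function name load_json_from_url is hypothetical, and the sketch assumes the file is encoded as UTF-8.

```python
import json
from urllib.request import urlopen

def load_json_from_url(url):
    """Download the JSON document at url and return it as Python objects."""
    with urlopen(url) as response:
        # Assumes UTF-8, the usual encoding for JSON served over the web.
        text = response.read().decode("utf-8")
    return json.loads(text)

# Example use (needs an internet connection; assumes the URL still serves JSON):
# data = load_json_from_url("http://mako.cc/cdsw.json")
# print(data)
```

The result is ordinary Python data (dictionaries, lists, strings, and numbers), so you can explore it with the same tools you would use on any other Python object.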

Putting data from a JSON file into a spreadsheet

  • This code puts certain data in the JSON file from the web into a .csv spreadsheet file so it is easier to work with.

Screen Shot 2014-05-03 at 12.03.23 PM

  • And this is what the .csv file you created looks like

Screen Shot 2014-05-03 at 12.00.48 PM
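One way to write selected fields to a .csv file uses Python’s built-in csv module, as in the sketch below. The function name write_csv, the sample records, and the choice of fields are all hypothetical; the workshop’s code may have picked different fields or written the file another way.

```python
import csv

def write_csv(records, fields, filename):
    """Write the chosen fields of each record to filename, one row per record."""
    with open(filename, "w", newline="", encoding="utf-8") as csv_file:
        # extrasaction="ignore" skips any keys that are not listed in fields.
        writer = csv.DictWriter(csv_file, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for record in records:
            writer.writerow(record)

# Hypothetical records, shaped like a parsed JSON list of objects.
pets = [
    {"name": "Bubbles", "age": 2, "favorite_color": "blue"},
    {"name": "Finley", "age": 1, "favorite_color": "green"},
]

# Example use: creates pets.csv in the current working directory.
# write_csv(pets, ["name", "age"], "pets.csv")
```

The resulting file has a header row followed by one row per record, and it opens in any spreadsheet program.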

  • These are the basics of what you need to do to be a data scientist, whether you study activism or any other activity carried out online.
