Saturday, December 1, 2018

Learn Python and APIs with Open Weather Map in 5 Mins

My motivation for this post is to show how easy it is to start programming. Too many people are mystified by what programmers do, when in reality anyone can learn a little programming and automate simple tasks in their lives.

Here I use Open Weather Map to get weather data for a list of US cities. I use Python since it is one of the most readable and easiest languages to get started with, and I use the library 'requests' to simplify getting data from an API. The whole code is about 20 lines including comments, and you can run it yourself here without downloading anything! Here is the full code if you just want to jump in:





Topics: Python, APIs, HTTP Requests, Arrays, For Loops, Conditionals, and More!
Python
I love Python for its simplicity and readability, which helps when I need to quickly put together a prototype. Many other languages, like Java, C++, Node.js, etc. are more verbose and so can often take longer to write and understand. While these other languages have their own benefits, Python is very powerful and a great starting point for non-developers. Whether you want to do data analysis, machine learning, or web backends, Python is a great tool.



APIs
API stands for Application Programming Interface, which is just nerd talk for "a simple way to communicate with a server." It's the building block or backbone for most of modern programming. Want to add Google Maps to your site? Use Google's API. Want to build a Bitcoin wallet app? Try Blockchain's API. Want to build an automated email campaign? Check out SendGrid's API. Want to build an object tracking computer vision app? Check out OpenCV. Want to build a machine learning app? Try TensorFlow's API. Want to creep on your friends? Just use Facebook's API.

The documentation is usually all you need.

Want to get stock price data? Try Intrinio's API. Want to track the number of clicks a link gets? Check out Bit.ly's API. Want to make a bid on eBay within seconds? Check out their API. I could go on and on and on, but I'm just as annoyed as you are.



HTTP Requests
Most APIs require making an HTTP request over the internet. In Python, you can use a simple library called requests. A library is just someone else's code that makes your life simpler. Instead of you having to write hundreds of lines of code yourself, someone else wrote them so that you only need to run one line. This is part of why it is infinitely easier to code in 2018 than in 1998. At the top of your code in Python you "import" the libraries that you plan to use. Many APIs return the data in JSON format, which looks like this: {"name":"Pepito", "age":30, "car":null}
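To make the JSON idea concrete, here is a minimal sketch that reads the example above using Python's built-in json module. The variable names raw and person are just illustrative choices, not part of any API.

```python
import json

# The example JSON from above, as the text an API would send back.
raw = '{"name": "Pepito", "age": 30, "car": null}'

# json.loads turns JSON text into a Python dictionary.
person = json.loads(raw)

print(person["name"])  # Pepito
print(person["age"])   # 30
print(person["car"])   # None (JSON null becomes Python's None)
```

Once the data is a dictionary, you can pull out any field by its name, which is the same thing the weather code does with the API's response.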

In our case, the HTTP Link we want to use is Open Weather Map's. The base link is http://api.openweathermap.org/data/2.5/. We can add what are called parameters to specify what kind of data we want to request. Open Weather Map's API has a ton of options. We will use 'weather' to get the current weather data, as opposed to 'forecast' for the 5 day prediction.

API Key Security
For most APIs you use, you will need an API key, which is basically your unique password. In general, you should never share your API key with anyone. Once a developer accidentally shared my Google Cloud key on a public GitHub repository, and someone proceeded to use my key for mining cryptocurrency. Fortunately, Google detected the abuse and reimbursed me within 24 hours; otherwise I would have been charged $300 for cloud usage. Accidentally sharing API keys is also how hackers stole the data of over 50 million Uber users! Additionally, you should try not to store API keys in plain text, but instead use environment variables. That's beyond the scope of this post, but if you are working on a security-intensive application, make sure to read a lot more and consult with experts. In this case, I am using a free API key that is limited to 60 requests per minute, and I am trying to avoid you needing to create an account, so hopefully no one abuses this key. If they do though, I will have to remove my key and replace it with "putYourAPIKeyHere."
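For the curious, the basic environment variable idea only takes a few lines. This is a rough sketch, not a security recipe: the variable name OWM_API_KEY and the helper get_api_key are names I made up for illustration.

```python
import os


def get_api_key(variable_name="OWM_API_KEY"):
    """Return the key stored in an environment variable, or a placeholder."""
    # os.environ.get returns the second argument when the variable is not set.
    return os.environ.get(variable_name, "putYourAPIKeyHere")


owm_api_key = get_api_key()
```

The point of the design is that the key lives in your computer's settings rather than in the code file, so sharing the code does not share the key.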

Logic: Variables, Arrays, For Loops, and Conditionals
If you just learn these simple concepts, you will be well on the way to learning programming. A variable allows you to store a word, integer, or other piece of information to be used later in your program. For example, we use weather = r['weather'] to store the response's weather attribute and later print it out. Printing is what spits stuff out in the console for you to read. We also use variables to adapt the HTTP Link we want to call:

httpLink = 'http://api.openweathermap.org/data/2.5/'+ parameter+'?q=' + city +',us&appid=' + owm_api_key
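One wrinkle with gluing strings together is that a city like 'New York' contains a space, which is not allowed in a link. As a hedged alternative, here is a sketch that uses urlencode from Python's standard library to escape such characters. The key shown is only a placeholder.

```python
from urllib.parse import urlencode

base = 'http://api.openweathermap.org/data/2.5/'
parameter = 'weather'
city = 'New York'
owm_api_key = 'putYourAPIKeyHere'  # placeholder, not a real key

# urlencode escapes characters that are not allowed in a link,
# such as the space in 'New York' and the comma before 'us'.
query = urlencode({'q': city + ',us', 'appid': owm_api_key})
httpLink = base + parameter + '?' + query

print(httpLink)
```

The server decodes the escaped characters on its end, so it still receives 'New York,us'.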

Arrays (called lists in Python) are basically multiple variables in one: they allow you to store many pieces of information in any format. We use cities = ['New York', 'Denver', 'Atlanta'] to store an array with the cities we want to analyze. This leads to the concept of 'for loops.' For loops are used if we want to iterate (or go one by one) through an array in order to execute a specific piece of logic for each input. In our case, we want to make a weather request for each city in our array. Finally, just to show the concept of conditionals, or if statements, I add a print statement if city == 'Denver'.
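Putting the array, the for loop, and the conditional together, the logic looks roughly like the sketch below. It is not the exact code from this post: the function names get_weather, describe, and print_report, and the Denver message, are my own illustrative choices, and the key is a placeholder.

```python
cities = ['New York', 'Denver', 'Atlanta']
owm_api_key = 'putYourAPIKeyHere'  # placeholder, swap in your own key


def get_weather(city, api_key):
    """Ask Open Weather Map for the current weather in one US city."""
    import requests  # imported here so the rest runs without the library
    link = 'http://api.openweathermap.org/data/2.5/weather'
    response = requests.get(link, params={'q': city + ',us', 'appid': api_key})
    return response.json()  # the JSON reply as a Python dictionary


def describe(city, r):
    """Build the lines to print for one city's response."""
    weather = r['weather']  # a variable holding the weather attribute
    lines = [city + ': ' + str(weather)]
    if city == 'Denver':  # the conditional
        lines.append('Denver is the Mile High City!')
    return lines


def print_report(cities, api_key):
    for city in cities:  # the for loop: one request per city
        r = get_weather(city, api_key)
        for line in describe(city, r):
            print(line)


# Uncomment to run against the real API:
# print_report(cities, owm_api_key)
```

Splitting the request from the printing is a design choice on my part; it lets you try describe on a made-up response without going online.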

From each of these concepts, you can start to build apps. Web APIs allow you to build off of other companies' libraries, Python makes it easy to start programming, and learning a little logic guides your intuition as a programmer. No doubt, there is much more to learn to be a great programmer, but this just shows that in a very short period of time you can learn how to program and build off existing concepts. 





Saturday, November 24, 2018

Captcha, Recaptcha, Postcaptcha (4 to 7 Minute Read)


Completely Automated Public Turing Test To Tell Computers And Humans Apart (CAPTCHA) is sometimes referred to as a reverse Turing Test, since a computer is trying to determine if you are human as opposed to the other way around. In the 1950s, Alan Turing proposed a test to determine if a computer could trick a human into thinking it was human: "Are there imaginable digital computers which would do well in the imitation game?," aka imitating human behavior. CAPTCHAs, on the other hand, are trying to get you to prove you're human. The ostensible purpose of these tests is to reduce spam from online bots and otherwise improve digital security.

Captchas suck!

The CAPTCHA Arms Race and Inaccessibility
Insofar as there is an incentive to spam or find security vulnerabilities (which there always will be), there is an incentive to try to crack the best CAPTCHAs. Earlier versions of CAPTCHA were fairly easy to trick by training simple machine learning models. This leads to an indefinite arms race, where CAPTCHA models improve the difficulty levels so bots can't easily crack them, and the bots retaliate by improving and eventually cracking the newest versions. The problem is that this arms race also makes CAPTCHAs harder for humans to pass, and they are notoriously difficult for blind and deaf people.




Google's response is ReCaptcha, an attempt to make CAPTCHAs "hard on bots, easy on humans." It uses pictures instead of words, since images are harder for a machine to interpret but easy for a human. This has the added benefit of advancing the field of machine learning, since Google can use our ReCaptcha answers to improve its products. 

Additionally, ReCaptcha is tracking behavior as soon as a user arrives on a site, looking at such metrics as mouse movements, prior site visits, and the speed of the browser interaction. That's how ReCaptcha attempts to be invisible, but even so, it is surprisingly easy to beat.



Beating Google with Google
Next time ReCaptcha wants you to pick some signs, trees, cars, street lights, etc., see if Google's own product, Vision, can beat it. You can simply take a screenshot and drag the image to the "Try the API" section for free without creating an account.



Here are my results from the above image, clearly indicating that Google knows this is a sign (from itself- God, I mean Google):



If the Vision approach doesn't work, there's also the audio approach: use Google's Speech Recognition API to break the audio challenge. It would also probably be easy to train a bot to emulate human mouse movements and speed, so inevitably this approach will also fail. Perhaps the next step is video clips with questions about context, but eventually Vision will have these capabilities too.

Google could introduce adversarial attacks to their own ReCaptcha, but that too is at best a short term solution. Actually, come to think of it, that would be an amazing way to improve Vision's API. Step 1: Trick Vision V1 using adversarial attacks which simply introduce static noise to an image like below in order to manipulate the confidence of the model's prediction. Step 2: Train Vision V2 on human solved Recaptcha puzzles with trained noise applied, since a human would still recognize a panda over a (what's a) gibbon. Vision V2 will then be more robust to adversarial attacks than V1 and repeat. Wait... Google, are you already doing this?
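To give a feel for how a little trained noise can move a model's confidence, here is a toy sketch of one well-known recipe, the fast gradient sign method. I am assuming that technique for illustration; the post does not depend on it, and the four-pixel "image" and the model weights are made up.

```python
import math


def predict(weights, bias, pixels):
    """Confidence (0 to 1) that the image is a 'panda' under a toy linear model."""
    score = sum(w * x for w, x in zip(weights, pixels)) + bias
    return 1 / (1 + math.exp(-score))


def sign(value):
    """Return 1, 0, or -1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def adversarial_example(weights, bias, pixels, epsilon):
    """Nudge every pixel by epsilon in the direction that raises the loss."""
    p = predict(weights, bias, pixels)
    # Gradient of the log loss with respect to each pixel, true label = 1.
    gradient = [(p - 1) * w for w in weights]
    return [x + epsilon * sign(g) for x, g in zip(pixels, gradient)]


weights = [0.8, -0.5, 0.3, 0.9]  # made-up model
bias = 0.1
image = [0.6, 0.2, 0.7, 0.5]     # made-up four-pixel "image"

noisy = adversarial_example(weights, bias, image, epsilon=0.1)
print(predict(weights, bias, image))  # confidence before the noise
print(predict(weights, bias, noisy))  # lower confidence after the noise
```

Each pixel changes by at most 0.1, which a human would barely notice, yet the model's confidence drops. Real attacks do the same thing against much larger models.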





Retreat from the Battles, Face the Coming War
The fact of the matter is that eventually there will come a day very soon when a bot can perfectly emulate online behavior- beat any CAPTCHA, understand all images as well as a human can, emulate mouse behavior, etc. Indeed, training a machine learning model to beat any CAPTCHA, not only a particular one such as ReCaptcha, is a step towards general artificial intelligence. And one day we will get there, so how can we build systems that are robust against bots and preserve the ability to prove you are human? A few considerations will be crucial.


Burden on the User vs Burden on the Business
The reason I struggle with recycling is because I wish that trash and recycling companies could be better at filtering and automatically recycling my trash; why do I have to do all the work for them sorting my trash and recycling in advance? Likewise, many have argued that CAPTCHA approaches put too much burden on the user. Perhaps instead businesses should focus on building secure systems that are robust to spam and other bots. While there are likely improvements to be made on this front, it is an optimistic view. Many applications, whether financial, healthcare, or government related, will always need users to prove their humanity.


In the Long Run We're all Human
Eventually, we will need some way to digitally prove we are homo sapiens. That might start with text verification, but it will likely need a biometric approach.  For now, only humans have eyes, fingerprints, and human DNA. Yet, can we ensure this information won't be spoofed? That may prove impossible. An ideal PostCaptcha system, however, will likely use these metrics and others to conduct a reverse Turing Test.

Additionally, to ensure privacy protection and efficiency, these approaches will need to be anonymous, provide simple mechanisms to generate new users, but also remain difficult for a bot to abuse through emulation. It is possible that a centralized government based solution will be necessary long term; however, it is worth entertaining the possibility of a decentralized approach.




Another Blockchain Idea
Perhaps in the near future, there will be a better way to share private information with companies through a blockchain. The idea of a blockchain with time/ event based location permissions is an attractive one; why not extend that to all user identifying information?

Some websites may require lower security and only need a fingerprint, while others, such as financial institutions, may require IDs, selfies, proof of address, medical, and even biometric data. Users could be in charge of which companies and how much data to share, preserving anonymity and control.

Needless to say, a future PostCaptcha system may need much more comprehensive information about a person than we currently require to know they are who they say they are. Whether this approach or another ultimately succeeds, we need to start thinking hard about what a PostCaptcha world will and should look like.







Thursday, October 25, 2018

A Solution to Cap Hill Parking (14 to 19 Minute Read)

A Single Tear for Free Parking
Capitol Hill in Denver has some pretty crappy parking for anyone experienced in the matter. If you get off work at 6pm or try to visit from out of town on the weekend, you are SOL and might spend 15-30 minutes or more looking for a spot. I personally don't know a single car-owning person in Denver who hasn't got at least 1 parking ticket. I've got 3, and I don't even own a car! SWIM got lucky and avoided being towed by sprinting at the speed of Usain Bolt after abusing an unused private parking spot. My neighbor wasn't so favored by the universe; she had to pay $300.

Denver General Fund Revenue Report


Interestingly enough, parking fines are the 5th highest revenue stream for the Denver General Revenue Fund at $30 million, yet paid parking meters make up only one third of that amount at $10 million.  That amounts to 10,000 tickets per parking agent in 2015!



Why is parking in Cap Hill so bad? It's basic economics, manifested by the unfortunately not so mythical tragedy of the commons; if you charge nothing for a public resource, everyone tends to overuse those resources. Yet, even the idea of paid parking elicits disgust. A single tear is shed for the beauty of free parking.


Green is all the space used for parking. Parking is bad downtown, but it is just as bad or worse in some of the surrounding parts. Credit Ryan Keeney

Behavioral Economics 101 meets a Shakespearean Tragedy 

The goal of traditional economics is to study the oh so mythical hyper-rational species, "homo economicus"; the goal of behavioral economics is to study an actual species, homo sapiens. One fundamental behavioral bias uncovered is the irrationality people face when something is free. There's a reason eBay sellers with free shipping fare better, 0% APR credit cards are so popular, and "teaser" (subprime) home mortgage rates may have contributed to the 2008 financial collapse.

Free Zombies
People become zombies for free stuff. We don't stop and rationally consider the opportunity cost or financial wisdom of waiting in line for a free shirt, that 0% APR now means 25% next year, that free lunch comes at the cost of listening to a boring speech, that free shipping means paying $120/year for a Prime membership, or ... do you really want to spend 10 minutes filling out a survey for a "free" taco supreme? Yes, time is almost literally money; perhaps the most hidden cost of all and one of the most scarce resources.

Side Note: Evolutionarily Irrational
Yet, why would we be rational about free things? For most of the millions of years of our evolutionary history, we never even traded, let alone had sophisticated price systems. Barter was invented within the past 10,000 years or so, and the Chinese invented the world's first paper currency within the past 2,000 years. If you struggle to think of parking in rational terms, don't worry, you're in good company:

"Thinking about parking seems to take place in the reptilian cortex, the most primitive part of the brain responsible for making snap judgments about fight-or-flight issues, such as how to avoid being eaten. The reptilian cortex is said to govern instinctive behavior involved in aggression, territoriality, and ritual display—all important issues in parking."
-Donald Shoup

But the evolutionary roots of behavioral biases is a topic for another post.

So Tragic
This irrational behavior is strongly exhibited, and possibly enhanced, in the tragedy of the commons. The idea was originally considered in the 1800s by a British economist, and in 2009 the study of this issue led to the first (and currently only) female winner of the Nobel Prize in Economics, Elinor Ostrom.

Traffic is the epitome of the tragedy of the commons

Externalities
The logic of this issue is straightforward: if you don't have to directly pay for something, it becomes easy to undervalue that resource. Economists explain this with a concept known as externalities: when two parties transact, if part of that transaction affects a third party without said party agreeing to it, an externality is created. The classic example of a negative externality is factory pollution, such as the Great Smog, which killed an estimated 12,000 people in 1950s London, but there can be positive externalities as well. If someone gets a vaccine, that bestows positive externalities upon society (Donald Trump's claims about autism notwithstanding).


This wouldn't be an economics blog without a terrifying graph.
Note: if the social cost is higher than the private cost, the actual equilibrium quantity is too high.
Equilibrium Private Quantity Qp > Equilibrium Social Quantity Qs

Competitive markets usually work well if there aren't externalities; but if there's a negative externality, businesses overproduce, and if there's a positive externality, they tend to underproduce. That's why governments often have to step in to subsidize vaccines or discourage pollution with regulatory fines.
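For anyone who wants numbers behind the overproduction claim, here is a small worked sketch with straight-line demand and cost curves. Every number is made up purely for illustration, and the function name equilibrium_quantity is my own.

```python
def equilibrium_quantity(demand_intercept, demand_slope, cost_intercept, cost_slope):
    """Quantity where demand price a - b*Q equals marginal cost c + d*Q."""
    return (demand_intercept - cost_intercept) / (demand_slope + cost_slope)


# Made-up numbers: demand P = 100 - 2Q, private marginal cost P = 20 + 2Q.
external_cost = 20  # harm to third parties per unit, not paid by the producer

q_private = equilibrium_quantity(100, 2, 20, 2)
q_social = equilibrium_quantity(100, 2, 20 + external_cost, 2)

print(q_private)  # 20.0
print(q_social)   # 15.0
```

Because the producer ignores the extra cost of 20 per unit, the market settles at 20 units when society would prefer 15, which is the Qp > Qs result noted above.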

The tragedy of the commons is a unique instance of a negative externality, evident in issues as disparate as traffic, fisheries, climate change, overpopulation, and even voting. Since most roads are free, more people use them than toll roads, leading to overuse of free roads (hence traffic) during rush hour. Fisheries are quickly depleted without explicit and proactive conservation efforts. Climate change is perhaps the most egregious example: no one is charged for abusing the environment, so we get lots of abuse.

At least in these examples, some sort of pricing mechanism tends to improve the situation; strategically located toll roads can reduce congestion, strictly enforced Marine Protected Areas reduce over-fishing, and many economists believe that a carbon tax will be necessary to prevent catastrophic climate change.



Side Notes:

Is Population an Over-Tragedy?
Having kids is a tragedy (... of the commons) only if your kids are going to be a negative externality on the world. (Of course, your kids would never be that). On the one hand, lots of people means lots of diversity and innovation (positive), but also leads to more tragedies in other areas (such as pollution, competition for resources, and less damn parking for the rest of us). For the more common view that overpopulation is an issue, see here; for the opposite view that population increases are a positive externality, see here.

To Plainness Honor's Bound when Majesty Falls to Folly
Finally, the political system is in many ways a tragedy of the commons, and voting is almost certainly economically irrational. If I spend hours and hours every day researching the best candidates who advocate for intelligent policy positions, that not only helps me, it helps my fellow citizens, much like a vaccine. Conversely, if I have the attention span of a goldfish and only watch the political punch lines, there's no cost to me... so we collectively elect a reality TV star demagogue as president.

Irrationality Sucks... Mostly
Even if people were rational about the tragedy of the commons (i.e. screw my fellow citizens, I'm going to take advantage of X), these issues would be bad enough. But it is amplified by the aforementioned behavioral economics of free. Not only are the incentives in place for me not to care, I irrationally over-consume free goods. But... biases need not all be tragic.

Surprisingly (to only an economist), Elinor Ostrom's research indicated that there may be behavioral factors working in the opposite direction. Namely, people aren't always completely selfish and sometimes care about how their actions affect their communities and the environment. Her research indicates that markets can efficiently organize optimal behavior but only with the help of community and social norms, neither a purely liberal nor a purely conservative conclusion. Maybe the hyper rational neoclassical tragedy of the commons model is wrong...

The Horror: Paid Parking
Call me a pessimist on human nature, but in the case of Denver parking I think the best solution is by far to increase the amount of paid parking. Some behavioral norms could certainly in theory improve the situation, but I've seen too many pictures like this to think that altruism alone will solve this affair. And while we wait for Elon's self-driving cars to reduce some on-street parking congestion, don't hold your breath when we could do something sooner and simpler.



With population increasing yearly and density reaching max capacity, we need to face the underlying economics of the situation. Limited space and high population density result in a need to allocate scarce resources more efficiently by charging for parking usage. Street parkers are fairly sensitive to price, which means even a small rate could significantly reduce congestion and improve overall welfare.

Source: JusticeMap.Org

Let's keep in mind the basic economics of the situation; since parking is free, the demand for parking exceeds the supply, resulting in a shortage of parking spaces, more "cruising" (searching for a spot), and consequently more congestion.

We can either increase the supply or reduce the demand to reach a more desirable equilibrium. But increasing the supply of parking, i.e. building more parking lots, is one of the most inefficient uses of land and detrimental to the environment. This is likely why the Downtown Denver Partnership has indicated they do not have plans to expand parking downtown. So the better option is to reduce demand via paid parking. Additionally, this paid parking results in literally millions of dollars in revenue for the government, which can in turn be used for public transportation or other local investments.

The Hidden Costs of "Free" Parking
As any economist will tell you, there's no free lunch; free parking isn't actually free. In general, "free" parking actually means paying parking tickets or spending time looking for a spot. So, while we may individually at times prefer not to pay for a monthly or hourly parking spot, collectively we are paying a lot more for parking tickets or our time, which end up being quite crude enforcement mechanisms for shared resources.

Tickets
As mentioned above, parking tickets make up $30 million in Denver government revenue compared to $10 million for paid parking meters. Part of this is for violations of paid parking closer to downtown, but no doubt a significant amount includes tickets for free parking in high population neighborhoods like Capitol Hill.

People obviously don't like having to pay for parking, but they really don't want to pay for tickets. Denver had $14 million in unpaid parking tickets over 2015-2016, corresponding to ~15% of total tickets going unpaid. Curiously, paid parking increased 30% from 2011 to 2016, but parking fines only increased by 13% (comparable to population growth). This likely shows that it is more politically feasible to add paid parking options than to continue handing out tickets. Hopefully this discrepancy means the city of Denver is playing paid parking catch-up (it's a really fun game).

Denver population growth, 2010-2017
Denver population increased about 13%, similar to the increase in parking tickets, but much less than the 30% increase in paid parking. Hopefully, this means the city of Denver is playing catch-up. 

Time

Another hidden cost is the time spent searching or "cruising" for parking. This actually imposes a very real economic cost. For example, in a 15 block business district near UCLA, every year drivers cumulatively lose over 11 years of time searching for parking! On average up to 30% or more of cars on the road in a given city are cruising. "The record is held by the German city of Freiburg- in one study 74% of cars were on the prowl."

Economic Activity
Inevitably, there are times when someone would rather stay in their apartment than go across town and try to find parking in Cap Hill. That means lost economic activity from restaurants, bars, etc. I know I have friends who refuse to come over to avoid finding parking some nights. At least, that's ostensibly their reason...

Minor Accidents
A friend of mine had to pay nearly $1000 for barely tapping another car while parking. (... Should she have left her phone number?) It's quite possible that with painted lines and more clearly delineated paid parking this wouldn't have happened. At a bare minimum we should have painted lines to reduce the inefficiency of selfish parking and reduce the risk of petty accidents. I heard some people concerned that we wouldn't know the sizes of varying car models to properly paint the distance between cars; oh come on, it's Colorado, everyone drives a damn Subaru SUV.

Viagra for Cars
As urban planning economist Donald Shoup has pointed out, free parking leads to more cars on the road, effectively acting like a "fertility drug for cars." Free parking can subsidize the cost of car ownership by $150/ month, canceling out the revenues from gasoline taxes. This hidden subsidy could be costing $150 billion per year in the US alone. Just as one example, the opportunity cost of a single parking space in LA is estimated at over $31k per year, more than the value of many cars on the road!

Side Note: Since reducing car ownership will likely increase ride-sharing and public transportation, Lyft should lobby for paid parking just out of their own self-interest! I had no problem living in Cap Hill without a car, using a combination of biking, ride-sharing, and public transportation; if the cost of parking increased sufficiently, I'm sure lots of people would join my non-committal lifestyle. Also, I recently read in The Economist that the ROI on lobbying is 22000%, which is insane and a problem in and of itself, but also strongly reinforces my point.


Precision without Accuracy
Shoup also points out that most cities require a minimum amount of parking per square foot of construction. This is usually a terrible idea, since it encourages more parking to be built than is necessary. Plus it forces a precise solution when every business and building has very different parking needs.

Instead, it is better to implement a market price for parking. As Shoup argues, you ideally want the price high enough that there is always a little bit of excess parking spaces available. In other words, you would never have to look for parking again; there would always be a couple of spaces available every block! Shoup recommends pricing such that 85% of parking spots are taken and 15% are vacant.

Models
Georgia Tech
While it may seem like it would be inconvenient to install parking meters all across Denver, this is actually unnecessary. I was recently on Georgia Tech's campus and parked using an app called ParkMobile where all I needed to do was type a 4 digit code based on area and add my license plate. Everything worked out great, except that there is no option to update your license plate after you have paid. Since I entered my plate incorrectly as Georgia instead of Kentucky, I had to pay for parking twice: $3 (so $6 in total) for 3 hours, so I almost over-drafted my checking account.

Update: Philadelphia also uses this approach!



San Francisco
San Francisco follows Shoup's rules better than almost any US city after completing their rollout of hourly and location based pricing (temporal-spatial pricing for the nerds). Sensors detect the amount of parking available on a block and charge variable rates, from 25 cents per hour to $6. This leads to increased revenue for the city, as well as minimized time spent searching for parking.
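The logic behind this kind of variable pricing can be sketched in a few lines: raise the rate on blocks that are fuller than Shoup's 85% target and lower it on blocks that are emptier. This is only my illustration, not San Francisco's actual algorithm; the function name adjust_rate and the 25 cent step are assumptions, while the 25 cent floor and $6 cap come from the range above.

```python
def adjust_rate(rate, occupancy, target=0.85, step=0.25, floor=0.25, cap=6.00):
    """Return next period's hourly rate for one block.

    occupancy is the share of spaces taken, between 0 and 1.
    """
    if occupancy > target:
        rate += step  # too full: raise the price
    elif occupancy < target:
        rate -= step  # too empty: lower the price
    # Keep the rate inside the allowed range.
    return min(cap, max(floor, rate))


print(adjust_rate(2.00, 0.95))  # 2.25
print(adjust_rate(2.00, 0.60))  # 1.75
print(adjust_rate(6.00, 0.99))  # 6.0
```

Run repeatedly, a rule like this nudges each block toward having a space or two open, without anyone having to guess the right price in advance.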

The Georgia Tech approach doesn't require the smart meters to accept credit cards, since all payment is done via mobile. In this sense, that model has a lower upfront cost and maintenance making it preferable.

Distributional Concerns
One obvious concern is that eliminating free parking can disproportionally affect lower income individuals. Having to pay $150/ month for parking can really hurt budgeting. I believe these concerns are legitimate, yet they are misguided for a few reasons.

First, helping poor people by a policy of free parking is like using Federal Reserve interest rates to encourage home ownership. In the words of former chairman Ben Bernanke, that is like using a sledgehammer to kill a mosquito. Since interest rates affect so many other parts of the economy, it's better to use a fine tuned fiscal policy to encourage home ownership. Likewise, universal basic income or negative income tax credits are more likely to help poor people than allowing all the other distortions resulting from free parking.

Second, Denver residents, especially closer to downtown and Cap Hill, are generally higher income than the average across the country. See Justice Map's income map below.

Finally, the increased revenues from these policies can be used to subsidize public transportation or otherwise directly aid lower income individuals.

Source: JusticeMap.Org


Solutions
Things that would ameliorate Cap Hill's parking issue in increasing complexity/ benefit:

1. Paint lines so that there is less inefficiency in parking usage.
-Cost: <$350k for city of Denver based on $2 per line and the fact that only part of the city would need this. For example, further out on Colfax, parking is not nearly as problematic, meaning painted lines may not be necessary.

2. Initiate partial paid parking of 25% of parking spaces to test elasticity, revenue, response, etc.
-Cost: <$2 million for software, installation. Following the Georgia Tech model and using a third party solution like ParkMobile would likely cut this number by 75% or more.
-Revenue: ~$1.5 million/ year for 1000 parking spots x $100-$150/ parking spot / month

3. Scale to virtually all paid street parking implementing Shoup's optimal rule of 85% usage
-Cost: <$1 million 
-Cost: Likely loss of parking ticket revenue, but almost certainly less than the corresponding revenue gain from paid parking
-Revenue: >$5 million/ year 

4. Fully implement a San Francisco style (Temporal-Spatial Pricing) model, with hourly changing pricing based on supply and demand
-Cost: <$1 million in software costs
-Benefit: As close to an economically optimal situation as possible
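The back-of-the-envelope numbers in steps 1 and 2 can be checked with a few lines of arithmetic. This sketch assumes every paid spot brings in its full monthly price all 12 months, which is my simplification; the function name annual_revenue is illustrative.

```python
def annual_revenue(spots, monthly_price):
    """Yearly revenue if every spot earns its monthly price all year."""
    return spots * monthly_price * 12


# Step 2: 1000 spots at $100 to $150 per spot per month.
low = annual_revenue(1000, 100)
high = annual_revenue(1000, 150)
midpoint = (low + high) / 2
print(low, high, midpoint)  # 1200000 1800000 1500000.0

# Step 1: a $350k budget at $2 per painted line.
max_lines = 350000 // 2
print(max_lines)  # 175000
```

So the ~$1.5 million figure is the midpoint of a $1.2 to $1.8 million range, and the painting budget covers up to 175,000 lines.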

All of this revenue, increased economic activity, and reduced wasted time have great effects on the local economy. As such, Denver should implement a plan similar to the one delineated above. As Shoup focuses on, this revenue windfall can be used to further fund public transportation and local community initiatives, which in turn further reduces congestion, inefficiencies, and eventually means even more available parking. Let's save our tears for something better than free parking.