Asking the Right Questions About AI (medium.com/yonatanzunger)
107 points by tim_sw on Feb 1, 2018 | 83 comments


This is easily one of the best explanations of AI and explorations of its implications that I've read. Probably because I largely agree with the author, but have never been able to express such assertions so clearly:

> These are rarely new problems; rather, the formal process of explaining our desires to a computer — the ultimate case of someone with no cultural context or ability to infer what we don’t say — forces us to be explicit in ways we generally aren’t used to.

It was also interesting to read his inside account about Google Photos's "Gorilla Incident". I had always assumed it was due to training on limited data but that just didn't make sense as Google would not have been lacking for data in 2015.


The article is good, I agree.

The line you quoted is not specific to AI/ML, it's the fundamental rule of programming (and why I absolutely love it as a tool for understanding) - the machine is the ultimate bullshit antagonist. You can't handwave away your lack of understanding of a problem, like you can do with people. And the process of trying to code a problem (e.g. for a simulation) forces you to learn just about everything you don't yet comprehend about it.


> The line you quoted is not specific to AI/ML, it's the fundamental rule of programming

Very good point. Though the OP's observation comes with additional poignancy/force in the context of artificial intelligence. A lot of people are fine with the idea of machines being told explicitly what to do when it comes to doing "mechanical" things. I would guess most of these laypeople do not realize that the same sausage-making process is involved when it comes to having machines do human-like things.


> You can't handwave away your lack of understanding of a problem

Well, we do use packages written by others all the time which we don't necessarily need to have a complete understanding of.


Point 3.2 is golden, and I'm very happy people seem to be no longer afraid to publish the ugly truth here (it's the second article I've seen touching this over the past month), instead of being in complete denial.

AI doesn't care about our nice "ought to be" fake worlds, and algorithms themselves don't have social biases. So when an ML algorithm spits out something that goes against the prevailing ideologies, it's most likely the fact in the data. If the data is good (i.e. it represents the real world properly), then the ideologies should be questioned.


So when an ML algorithm spits out something that goes against the prevailing ideologies, it's most likely the fact in the data.

You seem to have missed the subtlety in the author's treatment of this subject. His point is exactly that the data is not necessarily "good". It's biased because we are biased [1], and you have to be aware of that when interpreting results. It's the data that needs to be questioned.

[1] if three white teenagers are arrested for a crime, not only are news media much less likely to show their mug shots, but they’re less likely to refer to them as “white teenagers.”


I didn't miss it; I address it in the very next sentence - "If the data is good (i.e. it represents the real world properly), then the ideologies should be questioned." There is an if-then clause there.

The point about mugshots shows perfectly the problem of what happens when the data is biased.

But then the corollary is, sometimes the data is good. The relevant quote from the article:

> If you try to manually “ignore race” by not letting race be an input to your model, it comes in through the back door: for example, someone’s zip code and income predict their race with great precision.

This is what happens with loan or crime data sets, and it made plenty of noise in the news recently - but only the kind of noise in which people say it's obviously the algorithm that's broken, because it doesn't fit the "polite fiction" they'd like to believe.
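To make the quoted "back door" concrete, here is a minimal sketch (Python with numpy and scikit-learn, entirely synthetic data and invented correlations) of how a model that is never shown a protected attribute can still recover it from zip code and income:

    # Synthetic illustration only: every correlation below is invented.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    n = 10_000

    # Pretend world in which group membership skews both zip code and income.
    group = rng.integers(0, 2, size=n)                  # protected attribute
    zip_code = np.where(group == 1,
                        rng.integers(0, 20, size=n),    # mostly low-numbered zips
                        rng.integers(15, 40, size=n))   # mostly high-numbered zips
    income = 30_000 + 25_000 * (1 - group) + rng.normal(0, 10_000, size=n)

    X = np.column_stack([zip_code, income])             # the protected attribute is never an input
    X_tr, X_te, y_tr, y_te = train_test_split(X, group, random_state=0)

    clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_tr, y_tr)
    print("accuracy recovering the protected attribute:", clf.score(X_te, y_te))

Any downstream model fit on those same features inherits the same signal, which is the sense in which dropping the sensitive column doesn't remove the bias from the data.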


> But then the corollary is, sometimes the data is good.

Yes. But in my limited experience with ML, "the data is good" usually isn't the

>> most likely

explanation :-)

> This is what happens with loan or crime data sets, and it made plenty of noise in the news recently - but only the kind of noise in which people say it's obviously the algorithm that's broken, because it doesn't fit the "polite fiction" they'd like to believe.

People say the algorithm is broken because it's illegal to discriminate on the basis of race, and these algorithms were sneaking "discriminate by race" "in through the back door".

Calling the law a polite fiction won't get you very far when the judge issues an injunction against using your product because it's racially biased.

And the judge doesn't care how machine learning researchers define bias. He cares how the law defines bias.

So, if you're diddling around on your computer in your own time, then I guess the algorithm isn't broken. But if you're building a product you want to sell to courts or insurance companies, the algorithm very much is broken.

That is, unless you think "the law is a polite fiction and bias means only what ML researchers say it means" would be a winning argument to lift an injunction against your product. If you're going to make those arguments in front of a real judge, let me know; I want to see the bulge in that judge's forehead :-)

No. People say the algorithm is broken because it obviously is broken.


So you're basically saying the algorithm is broken because it discovers illegal correlations, even though they may be true.

Well, that's precisely "polite fiction".

(Also, laws are not designed as truth-seeking tools but as usable heuristics that take into account the limits of individual humans.)


> So you're basically saying the algorithm is broken because it discovers illegal correlations, even though they may be true.

No. Discovering those correlations is completely legal. Making certain decisions based upon those correlations is illegal. Mostly because by making a decision, you make a tacit assumption about causation that's borderline impossible to prove and can have a huge impact on people's lives.

Like I said, if it's just you in your home office having fun, go at it. But if you then bake that model into certain products, you have a very serious bug.

> Well, that's precisely "polite fiction".

Except in this case, it's not even that!!!

The observation that certain crime statistics are highly correlated with race, and that race is correlated with zip code, are not new observations. ML did not usher in some brave new world here. Just because we call it "AI in 2018" instead of "John from the Actuarial dept. in 1960" doesn't change the moral, ethical, or legal landscape. (the article literally makes exactly this point.)

And despite your characterization, this particular truth (race correlates to crime correlates to zip) isn't even a politically incorrect observation!!! It's something everyone already knows, and I've never seen someone attacked for pointing out this correlation. In fact, it's a favorite talking point of social justice types! Pointing out this correlation is not impolite.

The impolite assertion is that there's a causative link between race and crime. That's an assertion that models (tacitly) make when their users shift from truth-seeking to decision-making. See the "question you thought you asked vs. question you actually asked" portion of the essay.

Now, if ML algorithms discovered some genetic, racial, causal theory of crime, then you might have a point about ML exposing polite fictions in this case. But they didn't, so you don't. COMPAS isn't being censored from sharing a politically incorrect truth. It's being prevented from ruining people's lives with a really, really lazy application of statistics.

I often joke that racial discrimination laws are one of the few examples where "being bad at math" is not just criminal, but unconstitutional.

> ...laws are not designed as truth-seeking tools

Again, in cases where bias becomes illegal, these models are NOT just being used to seek truth. They're being used to make decisions.

You seem to have misconstrued the fundamental thesis of the article. The author isn't calling for death to polite fictions. And there's a concrete example in the article of this point of departure between your perspective and his. Namely, his response to people-as-gorillas was not "fuck you, the math is right". And the crescendo of the piece is "AI is just a tool, not a divine oracle, and there's nothing new under the sun".


> This is what happens with loan or crime data sets, and it made plenty of noise in the news recently - but only the kind of noise in which people say it's obviously the algorithm that's broken, because it doesn't fit the "polite fiction" they'd like to believe.

The polite fiction would be that it is only the algorithm that's broken when it is actually the country that is broken.


Yes, exactly.


Huh? It seems to me that the OP is using race correlating with zip code as another example of how data can be misused and misinterpreted, e.g. feeding in a dataset that has zip code and income could still lead to an AI making a model not much different from one trained on data that included race.


I believe the article implies that calling it "misuse" or "misinterpretation" is preferring "polite fiction" to reality.


The text does not support your interpretation. He says that data in the United States almost always has a racial bias, and that even when you explicitly remove race from the data, you and your ML model can still inadvertently make predictions that fall along racial lines because of how other data points strongly correlate to race.

He then goes on to talk about COMPAS, which was accused of being racially biased even though COMPAS explicitly did not include race as one of its inputs. The author argues that COMPAS still made predictions that were racially biased because of how strong the correlation is between race and housing. On top of that, the author argues, COMPAS asked the wrong question (who is more likely to be convicted, vs. who is more likely to commit a crime).

The author does not, as you seem to do, celebrate the Justice system algorithms for making harsher, but objective data-driven recommendations for minority groups. He argues that this is unjust, that the AI is missing the necessary nuance and context, and that the courts should not have been so trusting of AI:

> There are obviously many problems at play here. One is that the courts took the AI model far too seriously, using it as a direct factor in sentencing decisions, skipping human judgment, with far more confidence than any model should warrant. (A good rule of thumb, also recently encoded into EU law, is that decisions with serious consequences of people should be sanity-checked by a human — and that there should be a human override mechanism available.) Another problem, of course, is the underlying systemic racism which this exposed: the fact that Black people are more likely to be arrested and convicted of the same crimes.


>> As I write this, I’m going to use the terms “artificial intelligence” (AI) and “machine learning” (ML) more or less interchangeably. There’s a stupid reason these terms mean almost the same thing: it’s that “artificial intelligence” has historically been defined as “whatever computers can’t do yet.” For years, people argued that it would take true artificial intelligence to play chess, or simulate conversations, or recognize images; every time one of those things actually happened, the goalposts got moved. The phrase “artificial intelligence” was just too frightening: it cut too close, perhaps, to the way we define ourselves, and what makes us different as humans. So at some point, professionals started using the term “machine learning” to avoid the entire conversation, and it stuck. But it never really stuck, and if I only talked about “machine learning” I’d sound strangely mechanical — because even professionals talk about AI all the time.

That's a typically unhistorical explanation of the relationship between machine learning and AI, which is really quite simple: machine learning is a field in the broader research subject of AI. "Professionals" did not "at some point" start using machine learning instead of AI. The field got a name when it solidified into a field, just like Natural Language Processing did, or Machine Vision.

In fact, to use "machine learning" as a byword for "AI" is not different than using "NLP" or "Machine Vision" as a byword for AI. It is that wrong and makes it that evident that the person using the terms doesn't really understand where they come from.


The field got a name when it solidified into a field, just like Natural Language Processing did, or Machine Vision.

I'd say a claim of machine learning being "solidified into a field" is somewhere between false and meaningless.

Just look at neural networks. These may be the most successful machine learning or AI technique ever, but they're also the technique most dependent on ad-hoc tweaks and tuning of anything so far. Which seems rather the opposite of solidified.


Machine learning has a journal and a conference, like neural networks do. They're AI fields. I don't know what your definition is.


The Machine Learning journal (which later became the Journal of Machine Learning) was started in 1986 [0], which is 25+ years after the term was first coined [1].

According to Google Trends [2], both computer vision and artificial neural networks, as topics, were more popular than machine learning until 2010. Afterwards, both subfields drop to their historic lows of search interest while interest in machine learning multiplies year over year. It's hard to believe that such rapid growth didn't involve a conflation of taxonomy.

[0] https://en.wikipedia.org/wiki/Machine_Learning_(journal)

[1] https://en.wikipedia.org/wiki/Machine_learning

[2] https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0...


If I remember correctly, google trends looks at the frequency of use of n-grams in a google search corpus. I wouldn't expect this to reflect research activity, as such.

The fact that the journal of machine learning is older than the term also doesn't say much. My disagreement with the previous poster seems to be about whether machine learning is a field of AI or not. Well- if it has a journal (and a conference) (or the other way around), then it's a field of study.


I agree that the OP here makes pithy-sounding observations without empirical evidence ("professionals started using the term "machine learning" to avoid the entire conversation") but I don't see how taxonomy and usage of a term is "really quite simple".

According to Wikipedia, the term was coined at IBM in 1959:

https://en.wikipedia.org/wiki/Machine_learning

And it doesn't seem like it was solidified at that time according to Google Books, which shows usage of the term being relatively negligible until the 1980s:

https://books.google.com/ngrams/graph?content=machine+learni...

Much of the theoretical foundation of machine learning seems to have been set in the 20th century, but according to Google Trends, searches for the topic remained pretty flat until 2015:

Compared to AI as a "field of study" (presumably, some of the searches for machine learning as a "field of study" are conflated with AI as a field of study): https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0...

When comparing search interest for "artificial intelligence" vs "machine learning" as a search term, though, the latter starts out absolutely dwarfed by the former. But by 2017, the search interest for "machine learning" has increased 9-fold and beats out "artificial intelligence".

https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0...

Neither computer vision nor machine vision nor NLP as topics show remotely the same growth -- in fact, all have fallen since the beginning of Trends' analysis:

https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0...

Again, I think the OP overreaches in guessing why AI professionals seem to be using "machine learning" more interchangeably with their work. But how do you explain the massive jump in interest/search activity recorded by Google in the past 5 years? Isn't it possible that the OP is wrong about the motive, but right in observing that "machine learning" has become a trendy term for AI work in general?


>> And it doesn't seem like it was solidified at that time according to Google Books, which shows usage of the term being relatively negligible until the 1980s:

I didn't claim machine learning solidified ("became established" might be a better turn of phrase) as a field right when its name was invented. However, in the present time it is a field of AI research. It doesn't really matter when it started or how - the main point is that it's not an alternative term for "AI", but a part of AI.

I don't know what the google trends searches are meant to show. To understand the development of machine learning as a field you'd have to look at statistics about the publication of machine learning research in scholarly journals- not what's being discussed on twitter and facebook (and even HN).

Btw, may I clarify this? I'm not so much preoccupied with nomenclature here (though I can't help it when my inner pedant revolts). I'm mostly pointing out the fact that the article author has no idea what he's talking about and is just making it up as he goes along. Which is obvious in the rest of the article also.

I mean, really? Every AI has sensors and actuators? And "actuators" is everything from hydraulic systems to html form fields? That's a bit of a stretch.

>> Isn't it possible that the OP is wrong about the motive, but right in observing that "machine learning" has become trendy term for AI work in general?

Sure, in the lay press. There's a scientific community that does research on AI that doesn't use "AI" interchangeably with "machine learning". Whose understanding matters most?


Google Trends is meant to show the popularity of search terms and topics. The whole of academic/research literature is too small to be a major factor in overall Google search interest -- even if machine learning was literally the only topic that AI-related journals and academia were talking about, it would not explain why search interest in the topic grew by 900% in a span of 3-5 years, to the point where, as a search term, it's more popular than artificial intelligence.

I don't think you give the author enough credit here. He makes it clear that he is writing to a general audience, an audience that he suspects does not know what AI is, nevermind that machine learning is a subset of AI. The author is guessing, rightly, that a general audience has now heard a lot about a thing called machine learning.

So the author says that this is likely because machine learning has become the trendy or popular way to describe a lot of artificial intelligence work, and that this is why he himself will be using machine learning and artificial intelligence in seemingly interchangeable ways.

The Google Trends data supports this idea. The Wikipedia pages for "Machine learning" and the history of machine learning support it too, in the way they include achievements of the other subfields (like computer vision).

What data or what evidence do you have that "machine learning" has become such a popularly-used (and searched-for) term because more "actual machine learning" (whatever that is) is being done? When you say "the main point is that it's not an alternative term for 'AI'", where is your evidence that the rest of the world, despite the original intent of the term and its taxonomy, hasn't decided to conflate "machine learning" with "AI"?

I don't mind you being a pedant. I myself hate it when people use "begs the question" divorced from its original meaning -- I'll even stop reading articles in the New York Times when I see a writer misuse it. But just because a linguist or philosopher tells a general audience that "begs the question" is now generally understood to mean "raise the question", that doesn't make that person a complete idiot in their field.

In terms of "whose understanding matters most?" -- I can't even guess what your line of thinking is here. You really think that scientific research and literature operate in a silo from the rest of the world? Even though the rest of the world is a massive factor in whether that scientific research gets funding? I work in academia -- and yes, I have no problem believing that academic research is influenced by what draws popular attention, and that there is nothing about the field of machine learning research or AI that makes them immune from such pressure, especially since it's possible to use machine learning and AI in the same kinds of articles and research contexts with no loss of understanding at all.

I'm curious what else in the article you think reflects that the OP "has no idea what he's talking about and is just making it up as he goes along." The OP asserts that he was "the technical leader for Google's social efforts (including photos)". You think he's lying about that? Or that that position must not have required any knowledge of machine learning/AI?


My sincere apologies, but I don't think this conversation is going anywhere. I would prefer to end it here.

For an introduction to AI, I recommend the classic AI textbook by Russell & Norvig, "Artificial Intelligence: A Modern Approach". Go for the most recent edition you can find.


One of these days, sooner or later, this is what is going to happen:

Some people being toted about by an autonomous car are going to be killed. The NTSB is going to blame the car for making bad decisions. The car maker is going to blame the condition of the road, signs, or painted lines. They are going to pin the blame on the state government.

Insurance companies will point fingers at each other, depending on who they represent.

Then, everyone is going to sue everyone.

The state is going to have to explain WHY it let these cars on roads that were not maintained at a level of being 100% compatible with the level of technology in existence.

The state will say they were told the technology was safe, and pull out fancy presentations that were put together with no intention of ever ending up in court.

Technology companies will shrug and say it was marketing speak and, well, you knew the risks.

Then everyone will start covering their butts, and AI in autonomous cars will stop being a thing for a while.


I think this is giving the NTSB too little credit. Remember, the people who work at the NTSB deal with statistics all day, and for the most part genuinely care about reducing traffic deaths. So I think that, for their part, it'll look more like the way they've handled investigations into crashes with current partial automation (ie Tesla's autopilot): with a lot of emphasis on the overall effect of the system on accident rates, rather than on one specific crash.


We've already had fatal self-driving crashes while the self-driving system was in operation. The media irresponsibly, but predictably, tried to use it to drive people into a terrified frenzy - which surprisingly did not work. And even more surprising, the regulators were also reasonable. There was an understanding of what caused the crash and that there are inherent risks. This was contrasted by data indicating that, for instance, since Tesla rolled out their Autopilot system, crashes as a whole declined by some 40-odd percent.

And finally there is even just macabre data on fatality rates. On average in the US currently about 1.2 people are expected to die in car crashes per 100 million miles driven. Tesla alone has had hundreds of millions of miles driven in its full autopilot mode, and the fatalities are not racking up like they 'should' be. We could argue that maybe people are paying more attention when the autopilot system is enabled, but practice across the industry has shown the exact opposite to be invariably true.

I think it's just been so long since our society had a revolutionary 'physical' technology, that we've become scared of change and progress. Imagine what a change it was going from horses to cars operating on internal combustion engines and relying on brakes to stop these vehicles going far faster than any horse. Or imagine the idea of stringing the entire country up with electric poles with so much energy going between them that anything landing on a wire and touching something else was certain to be killed, and accepting the fact that when these poles, or the lines between them, go down they not only could but on some occasions would start fires, and so forth. Going from human to automatically driven vehicles is hardly meaningful contrasted against many of the revolutions of technology throughout the times.

And one final point is that we're also being somewhat self centered in this discussion. These technological changes we're seeing are happening worldwide. An image of e.g. Singapore streets with people in vehicles relaxing as they are autonomously driven to their destination would leave the state of the US and its rules and regulations looking increasingly like an anachronism. Certainly something that would cast a dim light on our view of ourselves as the world leader in innovation.


I generally agree with you. It should, however, be noted that you'll need to compare fatality rates caused by human drivers and fatality rates caused by Tesla's Autopilot under the same driving conditions. I am pretty sure that there is a vast range of driving conditions under which the Autopilot cannot be engaged safely (and will not get engaged). It's one thing driving on a highway and another thing driving in a crowded city. Here in Germany we have comparatively few fatalities on highways. Most occur on rural roads and in cities.


You have to distinguish between fatal crashes and all crashes. Here [1] are the data for the US if you can tolerate that site. What is the typical condition for a fatal crash? It's extremely counter intuitive. It is driving straight, on a major roadway, in good weather, during the day time, while sober, in a 4 door hardtop sedan. Seriously.

I'm reluctant to go social science and start handwaving too much causality there, but on the other hand I don't think it's entirely unreasonable to observe that those 'easy' driving conditions also tend to be some of the highest speed driving conditions. Creates a nasty combination of a situation where we might be more subject to lapses of attention, due to comfort, and exponentially increased consequences when that lapse results in a crash.

But the really cool thing here is that these are also the conditions that right now the self driving systems are rocking. In other words there's actually a chance you're right that we're not comparing apples to apples, but instead are actually oversampling conditions where fatal accidents are more likely to happen!

[1] - https://www-fars.nhtsa.dot.gov/Main/index.aspx


> We've already had fatal self driving crashes while the self driving system was in operation.

There is no true (level 5) self-driving system yet, only driver-assist where the system may hand back control at any moment. It's only true self-driving when there's no human in the car.

> Imagine what a change it was going from horses to cars operating on internal combustion engines and relying on brakes to stop these vehicles going far faster than any horse.

https://en.wikipedia.org/wiki/Red_flag_traffic_laws - knee-jerk legislative response driven by a combination of fear and lobbying is always a risk.


That is rather humorous on the red flag traffic laws. Had never heard of that before!

One thing that's nice now about self driving vehicles is that it seems that more or less the entire automotive industry is hopping on board. For some time it looked like it was going to be Google + Tesla doing self driving and I think it being lobbied and consequently "regulated" out of existence in the US was a very real danger. But at this point, which influential lobbying force might, even if we're just speculating, be a concern?


Thank you PJC for hearing what I was trying to say.

Good example, too.


I'm not a particular optimist about AI, but I think your cynical take doesn't account enough for the potential changes in mindset and policy that could occur by the time an autonomous car can take full blame for a big disaster.

First of all, that incident will likely have been preceded by years of semi-autonomous driving becoming more and more mainstream, which means there will be more incidents similar to the Tesla autopilot fatality a couple years ago. And as we've seen, Tesla was unpunished in the NTSB's report, and self-driving investment continued unabated, so there's reason to believe that future fatalities won't cause major outrage because the public will have found ways to rationalize it.

Five to 10 years of mainstream semi-autonomous driving may be enough to do 2 things:

- Strongly reduce the demand for people to learn how to drive themselves, or want to drive at all (especially among the senior-age population). Keep in mind that trustworthy semi/mostly-autonomous driving will likely make Uber/Lyft even more day-to-day affordable.

- Strongly increase people's comfort with AI and robotics. Sure, AI already has huge impact and influence on society. But not in the day-to-day tangibility that comes with confidently trusting your daily drive to AI. Society's appreciation (and/or complacency) toward AI will likely be at a much higher level by the time fully-auto cars become mainstream.

So when that AI-instigated-driving-disaster finally happens, you have a status quo that's generally favorable toward AI, including the passage of laws that make road conditions more favorable toward AI, both in reducing operational hazards and in mitigating legal culpability. Finally, you have a society far more dependent on self-driving cars -- you really think the millions of people who are 90 to 100% dependent on AI-personal-transportation are going to give up their livelihoods (nevermind their investments in the actual hardware) because of one big wreck?

I'm too cynical to think that society will become more intellectual and rational about things. But I believe that when fully-auto cars are approved, there will be too much acceptance/complacency for outrage to cause the industry to roll back.

I agree that there will be lot of butt-covering between all the stakeholders when this fatal accident happens -- but I argue that there will be plenty of butt-covering well before this kind of situation.


An accident will happen, yes. But there are already about 3,287 road deaths per day globally [1], so hopefully people will view this in context.

[1] http://asirt.org/initiatives/informing-road-users/road-safet...


So?

Exactly what you describe happens all the time in every field.


> Intuitively, this seems obvious and valuable — yet when this is mentioned around ML professionals, their faces turn colors and they try to explain that what’s requested is physically impossible.

Machine learning models have an attention mechanism right? So just classify whatever has the model's attention.


Not sure why you were downvoted, I'm assuming this was a good faith question. I'm going to try to answer your question, but I'm still learning, so take this with a grain of salt.

Some models have attention, but it is not a requirement.

In models that do have attention, it may or may not be simple. In image captioning for example, you can do just that. See [1] for some pictures of this happening. But in this example, they are stepping through a caption, and seeing where the attention is focused for each word. Works fine for a short caption. Videos are just many images and sound, but already this is going to be more difficult. For natural language processing, you will be stepping through very many words. Some models use characters instead of words. Not sure how you could even make sense of that. I have seen people look at the attention in machine translation. Not sure how it would work for sentiment analysis.

So you aren't exactly wrong, but you are greatly oversimplifying this for the models where it is possible (based on my understanding). Overall, machine learning is very much a black box, despite efforts to the contrary. It also doesn't solve some of the other problems. For example, knowing the model is focusing on zip code doesn't help you remove bias that comes from the data.

1. https://arxiv.org/pdf/1502.03044.pdf
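For what it's worth, here is a toy sketch of what "classify whatever has the model's attention" amounts to mechanically (plain numpy, made-up vectors, not the model from [1]): attention is just a softmax over per-input scores, and those weights are what the caption-stepping visualizations display.

    # Toy scaled dot-product attention over invented token encodings.
    import numpy as np

    def attention_weights(query, keys):
        scores = keys @ query / np.sqrt(len(query))    # one score per input
        exp = np.exp(scores - scores.max())
        return exp / exp.sum()                         # softmax: weights sum to 1

    rng = np.random.default_rng(0)
    tokens = ["the", "dog", "chased", "the", "ball"]
    keys = rng.normal(size=(len(tokens), 8))           # fake encodings of the inputs
    query = keys[1] + rng.normal(scale=0.1, size=8)    # a query that resembles "dog"

    for tok, w in sorted(zip(tokens, attention_weights(query, keys)),
                         key=lambda t: -t[1]):
        print(f"{tok:>8}: {w:.2f}")                    # "dog" should get the most weight

The catch, as above, is that having these weights at all is model-dependent, and even when you have them they tell you where the model looked, not why.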


I wonder if humanity is going to become a lot more introspective once it becomes a parent to another intelligence. Things look a lot different once you start seeing your own biases reflected in the child you're trying to raise to be better than yourself...


Unfortunately not. Humanity has plenty of opportunity to examine its biases now and for the most part, most choose to ignore them or deny them.

Just like people think they're smarter than they are, right more often than they actually are, and more in control of their decision making than they are, they also think they're less biased than they are.

There is plenty of opportunity to confront that now around the world, and for the most part people don't.

Humanity will decide to address biases when we decide / are convinced it's the right thing to do, not because of the availability of more data.


Maybe we are already trying, but it's pretty damn hard?


> Humans are terrible at driving cars: that’s why 35,000 people were killed by them in the US alone in 2015.

Ahg misleading numbers. People are actually surprisingly good at driving. Wikipedia says that the US driving fatality rate (which is worse than most other countries) is "7.1 road fatalities per 1 billion vehicle-km". You can tell it's low because it's measured in "per billion vehicle-km".

EDIT: When I said "misleading numbers", I should have said "misleading units". The number 35k people killed in 2015 measures part of the toll that driving takes on the human race, and it is terrible. However, the units of deaths/year is not a measurement of how good or bad people are at driving. Deaths/mile-driven is such a measurement.
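For concreteness, the conversion between the two framings is simple arithmetic (approximate figures: ~35,000 US deaths in 2015, ~3 trillion vehicle-miles driven per year):

    # Back-of-the-envelope: deaths/year vs. deaths per billion vehicle-km.
    deaths_per_year = 35_000
    vehicle_km_per_year = 3.0e12 * 1.609                 # ~3 trillion miles, in km

    rate = deaths_per_year / (vehicle_km_per_year / 1e9)
    print(f"~{rate:.1f} deaths per billion vehicle-km")  # ~7.3, close to the 7.1 figure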


I think these 'humans are terrible at driving' statements, while true, also forget to mention that 'computers are also terrible at driving'. There are near infinite edge cases in driving a vehicle, and I have yet to see any convincing proof the current machine learning efforts are truly up for the challenge. Edge cases are where computers fail: That general intelligence they're missing is needed to decide what to do. Even Waymo is offloading that to a call center of remote drivers. Humans can definitely be faultier at consistent driving in a way computers are not.

I think the ideal case remains investing in safety systems that ensure inattentiveness of a driver doesn't cause an accident, while still leaving the human in control of the ultimate decision making of a vehicle.


Presumably, computers can get better at driving cars while humans cannot, so eventually computers will win this particular competition like they've won on chess, go, and tons of other things.


Chess and go are closed systems. And humans can get better at driving, especially if they have better tools. And really, if you want to maximize safety, you can just make cars safer and slower. Then it won't even matter how good of a driver you are.

But what computers are good at is not getting drunk, tired, and distracted.


You have hit the mark. And it isn't just exceptional circumstances; there is a skill and expectation distribution amongst human drivers that can be eliminated here.

We don't really care about the best human drivers. The challenge is preventing the worst human drivers from sitting behind the wheel and eliminating edge cases where an otherwise competent driver is not fit to drive due to some personal circumstance.


To piggyback off what you said -- "best" is much harder to define in driving compared to chess and Go, because those games have a much more defined victory condition than does normal driving. How do you "win" at (non-competitive) driving? A driver with a 0% accident rate might not necessarily be better if they drove at a speed of 0.5 mph.


Humans can get better at driving. Instructional campaigns, assistive technology, art, these things can all make ya better.


Sure, but when you fix a bug in the driving software you deploy it to all cars simultaneously. It's harder to teach all humans about a corner case you encountered only a couple of times.


In 2018 it is still easier to teach the humans as a point of fact.


Similarly, all your cars can decide slamming into a wall at full speed is a good idea; they all run the same software.


Ridiculously expensive for only a marginal improvement, and then people still have their hard wetware limits of concentration and reaction time. Machines have orders of magnitude better reaction time, are always 100% focused, and on aggregate, are much cheaper to teach.


If society makes driving safety a priority, it can be made really, really safe.

For example, school buses are the safest mode of transportation of any kind whatsoever in the US. Millions of kids nationally ride millions of miles per week, and fatalities per year are usually in the single or double digits.

Professional drivers, professionally maintained equipment and set procedures really are enough. Most accidents happen as some combination of people in bad mental states, equipment in bad conditions and bad stuff like weather.

Remove that and driving is something people are pretty reliable at doing - especially since people don't really need faster reaction times 99% of the time if they drive sensibly.

http://www.nsc.org/learn/safety-knowledge/Pages/news-and-res...


School buses are also more or less a long tank. They are built like rolling fortresses, they weigh a lot, and they are going to "win" any collision with anything less than a train most of the time. (You do not need the kids to wear seatbelts because, again... really nothing is going to stop that bus.) I saw the remnants of an accident between a car and a school bus once. The car was smashed to bits. The bus did not look like it had been hit.

This, of course, doesn't scale. If everyone drove buses then buses would not be as safe. And let's not even talk about gas mileage...


I don't think GP's point is really about buses. To quote:

> Professional drivers, professionally maintained equipment and set procedures really are enough.

There is something to that. We let people drive cars after close to zero training (something I personally find to be completely ridiculous, but the world is what it is...), and then we let them drive without any direct consequences of not paying attention (penalties for breaking traffic laws are absurdly low, and enforcement way too lenient). For the other extreme, compare with professional pilots.

That said, I feel that getting all cars self-driving will be much easier to achieve than restructuring society around professional driving only, or training every driver properly and giving them appropriate incentives.


Reaction times is not all you need to drive safely though. You also need situational awareness, which means an understanding of your environment. Which machines can clearly not do at all, yet.

Until AI can develop an understanding of the world that is on par with that of humans, it will remain unable to tackle the breadth and variety of situations that humans can.


Ok but, isn't it fair to assume we're already doing everything we can in that regard? There's only so much energy we as a society can invest in making better drivers. AI seems to have no limit to speak of.


Chess, go and tons of other board games. Outside of that very limited sort of activity, computers have "beat humans" in maybe machine vision and speech recognition, but then again only on very strictly defined challenges. And once the same (kinds of) systems are deployed in the wild all sorts of tiny little details emerge that suggest maybe the computers are not so ready to take over the world, yet.

Like, oh, I don't know- adversarial examples fooling state-of-the-art machine vision systems, or speech recognition failing for all but a very few target languages etc.


OTOH, I could reductio, humans can get better at language translation, while computers cannot.

Think of it this way: driving is a translation problem.


Why can’t computers get better at language translation?


I translated many Russian scientific journal articles to English via Google Translate, and the improvements since 2015 are striking. A friend of mine pointed out that might be largely due to the corrections I provided. Either way.


Well, I've been using Google translate to translate small sentences of Greek to English and back and the results have been consistently abysmally bad since the start (whenever that was- I can't say I remember; but before 2015 anyway).


Indeed, they demonstrably have over the last decade.


The only metric which matters is fatalities per billion km, which is lower for machine drivers.


Only if you assume that the miles are equivalent, but they aren't. The machines are handed the easy work because that's what they're currently good enough at handling, and they have a human standing by to take over at a moment's notice.


While I agree we should control for distance when comparing humans to machines I think we should also compare the type of driving. Are machines safer at driving on ice? How many human-driven fatalities happen on ice or in inclement weather?


On ice... Or driving without secondary human intervention. When a human is intervening every 1-1,000 miles do we really know how machine driven fatalities compare to human driven fatalities?

So far we only have metrics on autonomous driving where the machines are treated like student drivers in limited road conditions -- they don't drive in the rain and an instructor is ready to take over.

I wonder how the fatalities per billion miles compare between autonomous systems and student drivers in driver's ed. I don't recall a story about a driver's ed fatality, and there have probably been a lot more miles driven in driver's ed than autonomous miles.


> When a human is intervening every 1-1,000 miles do we really know how machine driven fatalities compare to human driven fatalities?

Do you think remote drivers can intervene so fast as to prevent a crash? I think it's only useful when the car is blocked in traffic in an unusual situation, such as when something happened to the road, or there is a protest, or something like that. SDC's should be able to stop on their own when a dangerous situation appears.


Waymo likes to play up statistics that don't matter. For instance, most of their "hundreds of thousands of miles driven" statistics they brag about in every press release constitute driving around the same handful of streets in Mountain View, CA. They don't experience variety of driving, just... the same set of roads repeatedly. In a place with ideal weather. And they can tick up a disengagement count to prevent an accident whenever they want.

The statistic we deserve is "contiguous miles driven without intervention", and it needs to be spread across a large geographic area (something Waymo cars are completely incapable of right now due to their mapping requirements).


So by your rationale, we should be deploying self-driving cars now? Because AFAIK, the number of fatalities per km for cars classified as self-driving is zero, especially if you ignore all the accidents in which the human participant tried to intervene. Which of the self-driving car companies do you think would argue that their cars are now safe for general usage?


Does anyone have data on the average income or vehicle prices related to these deaths? How many accidents are being prevented by new currently available safety technology (proximity and blind-spot warnings, etc)? It's not like the advent of AI cars means that everyone just gets one, so this number isn't going to go down significantly unless you force poor people off the road by mandating that only AI get to drive or else give everyone an AI in their car for cheap.

Personally, I don't think the people driving this vehicle AI revolution really care about that number though. It's just what gets wheeled out to appeal to the public in order to justify the next luxury car industry for rich people who want to travel worry-free while intoxicated/sleeping/working/netflixing and to reduce the labor cost in all transportation services from trucking and busing, to operating construction and farming equipment.

The lives AI will save will be rich ones, and the lives that are ruined from increased insurance premiums (by not having an AI vehicle) and the destruction of entire labor markets will be poor/middle-class ones.

Is there a solution I'm unaware of? Is UBI going to cover those costs? Or is the rising inequality and resentment just not a concern since the AI will just solve that next?


Don't worry, the SDC sensors and hardware will be cheap, and software open sourced soon enough. It will be a technology for everyone like the cell phone - even the poor will have equal access. We're good at making cheap hardware, and there will be a push to make it available to everyone.


A RAND corporation study quotes similar data:

But is it practical to assess autonomous vehicle safety through test-driving? The safety of human drivers is a critical benchmark against which to compare the safety of autonomous vehicles. And, even though the number of crashes, injuries, and fatalities from human drivers is high, the rate of these failures is low in comparison with the number of miles that people drive. Americans drive nearly 3 trillion miles every year (Bureau of Transportation Statistics, 2015). The 2.3 million reported injuries in 2013 correspond to a failure rate of 77 reported injuries per 100 million miles. The 32,719 fatalities in 2013 correspond to a failure rate of 1.09 fatalities per 100 million miles.

https://www.rand.org/content/dam/rand/pubs/research_reports/...
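The per-mile rates in the quote follow directly from those totals; a quick check using the study's own figures:

    # Rates implied by ~3 trillion vehicle-miles/year and the 2013 totals.
    miles_per_year = 3.0e12
    injuries_2013 = 2.3e6
    fatalities_2013 = 32_719

    def per_100m_miles(count):
        return count / (miles_per_year / 1e8)

    print(f"injuries:   {per_100m_miles(injuries_2013):.0f} per 100 million miles")    # ~77
    print(f"fatalities: {per_100m_miles(fatalities_2013):.2f} per 100 million miles")  # ~1.09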


re: your edit; it's not necessarily true that deaths/distance is a better way to assess how terrible humans are at driving. People who are able to drive many miles (relative to the average) likely drive regularly on the interstate, which has been designed to make driving as decision-free as possible, even at high speeds.

Maybe the author's fault was to use deaths as a metric when accidents would have been far more telling of how frequently humans commit errors while driving, especially the kind of errors that computers do not make. Namely, computers make far fewer errors the more regular and controlled the operating conditions. Humans, on the other hand, can get into even more accidents the closer they are to their own home.


But there needs to be some normalization because machines drive many fewer miles. How do you suggest we normalize the number of fatalities for the sake of comparison?


But the OP doesn't argue that machines currently have a better fatality rate. And so it seems nonsensical to say that his use of the fatalities/year metric is more "misleading" compared to fatalities/km.

The OP used the fatality rate to say that humans were terrible at driving, because so many fatalities occur out of basic, preventable human error, despite all of the money and research and policy invested. And the context of the comment is in pointing out human frailty when it came to consistency at mundane, mechanical tasks.

I would agree that the OP's use of fatalities is perhaps sensationalist. His point would have been far clearer if he had talked about all kinds of accidents.


On what basis do you say that the OP's numbers are "misleading"? What's the objective benchmark on what number of fatalities/accidents is acceptable per billion miles?

Also, fatal accidents are not the only kind of error possible when it comes to driving.


> On what basis do you say that the OP's numbers are "misleading"?

It's misleading because of the units it's presented in: deaths/year. Those units measure (part of) the cost of driving on the human race. They do not represent how good or bad people are at driving.

> What's the objective benchmark on what number of fatalities/accidents is acceptable per billion miles?

I don't mean to say that the current number of deaths is OK! I want cars to get safer. However, when you say "people are bad at driving, and kill 35k people/year", it sounds like it's a low bar for self-driving cars to pass. However, when you say "people kill 7 people per billion km driven", it's clear that the bar is high. Tesla Autopilot has killed 2 people so far (I believe?); has it driven 280 million km?
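If it helps, the 280 million km figure is just the human rate turned around: the distance over which ~7.1 deaths per billion vehicle-km would predict 2 deaths.

    # Where ~280 million km comes from, assuming the ~7.1 per billion km human rate.
    human_rate_per_km = 7.1 / 1e9
    print(f"~{2 / human_rate_per_km / 1e6:.0f} million km")   # ~282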


Granted, the choice of metric and scope of measurement is as much an opinion as concluding that humans are "terrible" at driving, but I still don't see how OP's choice of metric was more misleading than yours. And Tesla is not full-AI/automated, nor do I think the OP argued that Autopilot or any AI system was any better than humans.

I think you misinterpret the OP if you think he was making the case that self-driving cars have a low bar to pass. It seemed to me that his whole spiel is that AI is a very difficult thing, and that the issues about AI and self-driving are not just about technology.

To be fair, the OP's phrasing is open to interpretation. I see him arguing that humans are bad at mechanical consistency, and this is most evident in driving because driving is a highly mechanical activity that most of us do. That more of us don't die is because of the many decisions made outside of our control. Not just automotive tech, but law and policy and zoning decisions that reduce the frequency and variety of scenarios in which we have to make a split-second reaction.


> Tesla Autopilot has killed 2 people so far (I believe?); has it driven 280 million km?

Ha, what? Tesla Autopilot cannot kill anyone. The drivers of those cars are responsible, not Tesla. You are making things up. Tesla has been very clear that you are responsible for the vehicle when Autopilot is engaged. It is not a self-driving car.

Also this is nonsense:

> It's misleading because of the units its presented in: deaths/year. Those units measure (part of) the cost of driving on the human race. They do not represent how good or bad people are at driving.

Deaths/year is a perfectly valid way to measure and discover that humans are bad at driving. If humans were not bad at driving, that number would be a lot lower, close to 0.

Also,

> However, when you say "people are bad at driving, and kill 35k people/year", it sounds like it's a low bar for self-driving cars to pass.

No, it does not sound like a low bar to pass. Driving is hard.


Uh, no...

> You can tell it's low because it's measured in "per billion vehicle-km".

No, you can't tell it's low because of that at all. What it means to be "good at driving" is entirely relative, just as what it means to be "low" is.

7.1 fatalities per billion vehicle-km is still a lot if you expect there to be 1 fatality per billion vehicle-km... Or if you expect there to be 0... The general idea is that machines could get that number much lower, and so - if that's your measurement of quality - are much better...


> Ahg misleading numbers. People are actually surprisingly good at driving.

What? No they are not. If that were true then hundreds of thousands of Americans would not die due to car accidents every decade. Just because there are hundreds of millions of us does not make us good at driving, if we 'only' kill tens of thousands per year.

> You can tell it's low because it's measured in "per billion vehicle-km".

You can tell that the CO2 in the Earth's atmosphere is low because it's measured in ppm. Not something we should be concerned about, right? And when we increase it by a few hundred ppm, no big deal right?

What? You are the one driving misleading statistics around. You are the one diminishing the lives of those murdered by drunk drivers. People are dying every day because humans are really, really bad at driving.

Let's say this again. People are not surprisingly good at driving. People are very very bad at driving.


Are humans good or bad at driving? Until a few years ago, humans defined entirely what "driving" was. Now two things can drive, humans and the autobots. Currently it seems we are better than the autobots. We are still the best drivers around, and good enough to make driving legal and widespread.

Deaths per mile or whatever summary statistic is just descriptive; I don't see an obvious, natural baseline.

Perhaps we could say something about the question: are human-controlled automobiles a safe form of travel relative to other modes of transportation? Or maybe calling humans bad drivers just highlights the risks of driving. But assessing the quality of driving of the human population still seems tautological to me.


Yonatan and I no longer see eye to eye on a lot of things, but this is a really good introductory post to how AI works and certain ethical questions around it. It's a long, but extremely enjoyable read.



