The best answer to this is the success of JavaScript. JavaScript has historically been a deeply flawed language. However, it was able to become one of the most widely used languages precisely because, despite all its warts, it implements the core foundations of the lambda calculus correctly (lambda functions, first-class functions, and closures). This gave JS programmers enormous power to overcome those warts. Here are two examples:
JavaScript historically had no way to do namespacing. In most languages this would be a deal breaker, but because JavaScript has lambda functions, closures, and first-class functions, a solution could be crafted from scratch. Immediately-Invoked Function Expressions (IIFEs) were one of the most powerful early techniques for creating scopes on the fly in JavaScript. Without these tools from the lambda calculus, you would normally have to rely on language-level changes to fix these problems.
The other early challenge of JavaScript was the need for asynchronous callbacks. These often require passing around data that you might not have access to when a function is written. Lambda functions allow programmers to quickly write ad hoc logic. First-class functions allow programmers to pass that logic around. And, most importantly, closures allow you to create functions on the fly based on data that is not known until an asynchronous callback is invoked. Again, without the core ideas of the lambda calculus in place, this kind of power would require significant language-level design changes.
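A small sketch of the idea (the function names and data here are invented for illustration):

```javascript
// makeHandler builds a callback on the fly around data (userId)
// that the eventual caller never sees directly.
// The returned lambda closes over userId.
function makeHandler(userId) {
  return function (response) {
    return "user " + userId + ": " + response;
  };
}

// A stand-in for an async API: it only knows how to invoke the
// callback, not what data the callback was built with. (A real API
// would call it asynchronously; this one is synchronous to keep the
// sketch self-contained.)
function fakeFetch(callback) {
  return callback("ok");
}

var result = fakeFetch(makeHandler(42));
console.log(result); // "user 42: ok"
```

The callback carries the data it was built with, even though `fakeFetch` knows nothing about it.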
Ten years ago JavaScript was a hideous language with many major issues. But because this mess contained the power of the lambda calculus, the language could be salvaged and extended to a wide range of uses.
> it was able to become one of the most widely used languages precisely because, despite all its warts, it implements the core foundations from the lambda calculus correctly
You think that's why JavaScript is widely used? I think it's widely used because it was already usable within the browser and people just kept going with it.
The Neural Network Playground is great for understanding this[0]!
The default example is classification of a circle of one class surrounded by a donut of another. There are two features x_1 and x_2 (this is the "raw data").
One solution to this problem is to use a single layer with a single neuron, but engineer features manually. These manually engineered features are x_1*x_2, x_1^2, x_2^2, sin(x_1), and sin(x_2). Here's a link to this model (long url)[1].
This model performs very well at learning to classify the data just by combining these manual features with a single neuron. The problem is that a human needs to figure out these features. Try removing some of them and observe how performance changes with different manual features. You'll see how important it is to engineer the correct ones.
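To see why those particular features work, here's a sketch (not the Playground's actual code) of a single hand-weighted "neuron" that separates an inner circle from the surrounding ring using only the squared features; the weights and radius are assumptions for illustration:

```javascript
// A single linear unit over the engineered features x1^2 and x2^2:
// it just thresholds the squared radius, which is exactly the
// boundary between an inner circle and a surrounding ring.
function engineeredNeuron(x1, x2) {
  var rSquared = 1.0; // assumed boundary radius squared
  // -1 * x1^2 + -1 * x2^2 + rSquared > 0 exactly when the point
  // lies inside the circle.
  return -(x1 * x1) - (x2 * x2) + rSquared > 0 ? "inner" : "outer";
}

console.log(engineeredNeuron(0.2, 0.3)); // "inner"
console.log(engineeredNeuron(2.0, 0.0)); // "outer"
```

The data that is hopelessly non-linear in (x_1, x_2) becomes linearly separable in (x_1^2, x_2^2), which is why a single neuron suffices.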
Alternatively, you can use 2 layers of 4 neurons [2]. In nearly the same number of iterations, this network also learns to classify the data correctly. This is because the non-linear interactions between neurons transform the inputs in the appropriate ways. That is to say, the network is learning to engineer the features itself. Try removing layers/nodes and you'll find that a simpler network has a harder and harder time with this.
I recommend playing around with the various tradeoffs between manually engineered features and network complexity. The interesting thing you'll observe is that in some cases a simpler model with manual features learns much faster than the network. The big issue comes up when we can't simply "see" the problem in 2D, so we have no idea which features may or may not be useful.
Even worse, there are people (utter frauds of course) who confuse MAD and MAD! In all seriousness, there is much confusion between Median Absolute Deviation and Mean Absolute Deviation out there. Ironically, the MAD in this article is still not a robust measure of variation in data, as it will break for the many distributions that have an undefined/infinite mean (Cauchy and Lévy, for example).
Even then, many summary statistics rely on a well-defined PDF, which is also not the case for much real-life data. I think most data scientists out there are very familiar with quantiles, which are often more useful since all random variables have a CDF (and the quantile function is just the inverse CDF).
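A quick sketch (with made-up sample data) of why the median-based MAD is the robust one of the two:

```javascript
// Mean vs. median absolute deviation on a sample with one wild
// outlier. The median-based version barely moves; the mean-based
// one is dominated by the outlier.
function mean(xs) {
  return xs.reduce(function (a, b) { return a + b; }, 0) / xs.length;
}

function median(xs) {
  var s = xs.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

function meanAbsDev(xs) {
  var m = mean(xs);
  return mean(xs.map(function (x) { return Math.abs(x - m); }));
}

function medianAbsDev(xs) {
  var m = median(xs);
  return median(xs.map(function (x) { return Math.abs(x - m); }));
}

var sample = [1, 2, 3, 4, 1000]; // one wild outlier
console.log(medianAbsDev(sample)); // 1
console.log(meanAbsDev(sample));   // 319.2
```

Of course, as noted above, even the median-based MAD will not save you when the underlying distribution has no defined mean at all.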
I quite enjoy Taleb's writing (I tend to find his ego a bit amusing) but I think even he is guilty of Jaynes' "Mind Projection Fallacy"[0] in regards searching for more meaning than exists in Fat-tailed distributions. When we model our data with infinite/undefined mean and variance distributions we're just saying "I don't know". No amount of cleverness with summary statistics, or understanding of pathological distributions will create information where there is none.
The overall point being: there are many, many ways of viewing statistics and it's pretty trivial to find a perspective that allows you to call someone a "fraud". Sure there are actual frauds in data science, but one of the biggest strengths in this trend is bringing quantitative people from a wide range of backgrounds to gain refreshing insights. It is much more useful to encourage cross-discipline exploration than to simply say "you don't belong here".
A single-layer autoencoder with n nodes is equivalent to doing PCA and taking the first n principal components. If you're familiar with PCA as used in natural language processing (where it's called Latent Semantic Analysis, or Indexing), you'll know that projecting high-dimensional data onto a lower-dimensional surface can actually improve your features. This is because similar words project onto the same principal component, allowing you to model some semantic information.
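For the linear case, here's a sketch of what a one-unit linear autoencoder ends up spanning: the first principal component, computed here by power iteration on the covariance matrix of some invented 2D data. This isn't an autoencoder implementation, just the direction such an AE would converge to.

```javascript
// Find the first principal component of 2D data via power iteration.
function firstComponent(data) {
  var n = data.length;
  var mx = data.reduce(function (s, p) { return s + p[0]; }, 0) / n;
  var my = data.reduce(function (s, p) { return s + p[1]; }, 0) / n;

  // Entries of the (unnormalized) 2x2 covariance matrix.
  var cxx = 0, cxy = 0, cyy = 0;
  data.forEach(function (p) {
    var dx = p[0] - mx, dy = p[1] - my;
    cxx += dx * dx; cxy += dx * dy; cyy += dy * dy;
  });

  // Power iteration: repeatedly apply the matrix to a vector and
  // renormalize; it converges to the dominant eigenvector.
  var v = [1, 0];
  for (var i = 0; i < 100; i++) {
    var w = [cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1]];
    var norm = Math.hypot(w[0], w[1]);
    v = [w[0] / norm, w[1] / norm];
  }
  return v;
}

// Points scattered along the line y = x: the first component should
// point along (1, 1) / sqrt(2).
var pc = firstComponent([[0, 0], [1, 1.1], [2, 1.9], [3, 3.05]]);
console.log(pc); // roughly [0.707, 0.707]
```

Projecting onto this direction is exactly the 1D code a linear single-hidden-unit autoencoder would learn (up to scaling).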
Autoencoders with more than 1 layer are more interesting because you end up doing what is essentially non-linear PCA by projecting your data onto a curved manifold. This famous paper, "Reducing the Dimensionality of Data with Neural Networks" [0], by Hinton shows the improvement in how linearly separable documents become once multi-layer autoencoders are used.
The old argument was that unsupervised pretraining helps get proper weights faster, but this has largely been disproven. However, I do believe AEs assist in semi-supervised learning because they project the initial data into a more useful space. As you can see in the paper I linked, the projected data are much more linearly separable.
And as practical evidence: I used a 5-layer AE in the Kaggle black box competition [1] to eventually outrank a team of Hinton's grad students. The problem had a large unsupervised data set with a small number of labels. Using the autoencoders before the MLP ended up nearly doubling our team's score.
Thank you for the answer. That makes a lot of sense.
Just a side note: as far as I know, a single-layer autoencoder and PCA are only equivalent if the units have no activation function (i.e., a linear activation), which is usually not the case.
A very useful visual to add next to this is "The Most Common Job In Every State"[0] map from NPR's Planet Money. As soon as I was aware of the research in autonomous driving I was pretty confident trucking would be the first industry hit. Until I saw that map I had no sense of how major of a disruption this could be.
Disruption nothing, this is going to be an economic apocalypse for huge swathes of the country. We've never seen an automation wave like this one. It's an exciting time to be alive for sure, but I'm also legitimately worried about whether we're going to be able to handle it as a society. If anything is going to tip us into needing basic income, this could be a serious contender.
It's not even just the trucking: it's truck stops, highway diners, tiny motels all over the place. There is a massive amount of infrastructure that was built because of and for truck drivers and their needs, and that's still not it, because there are tons of small-town economies built on the last job that pays well without a college education that more or less anyone can do.
No it won't, and it will come with a few of its own (getting Americans past the bootstrap narrative is probably the biggest single one), but it is a step in the right direction.
That being said if you can think of another example where we put 3.5 million people out of work nearly overnight along with an entire support system built on their salaries and needs, I'd love to hear it. The only similar thing I could think of would be the rise of automation in American factories, and even then that didn't replace EVERY human with a machine, and carried with it a certain PR cost for the companies involved, whereas I think a shipping company removing humans from their trucks would have a PR boon, not bust.
UBI has nothing to contend with in the bootstrap narrative. You just shift the target. Rather than pulling yourself up by your bootstraps to not starve or die of exposure, you pull yourself up by your bootstraps to do anything at all beyond basic survival with a roof and bread. You want a car? Go work for it. You want a TV? Go work for it. Etc.
UBI is only meant to reduce the demand for total income enough that people can recreationally work for things they want, because there isn't enough work to go around for all the things they need. If you wanted to live on a UBI without additional income, you could spend your days at parks or libraries, but by design you should not have the income to be purchasing luxury goods - if you want those you can seek work for them, and because of the drop in labor demand UBI causes, you should still be able to find something.
I have a hard time believing under any economic system Americans would be content with bread when their neighbor is eating steak. That is the whole basis of the consumerist culture that powers a good portion of American capitalism.
And the Soviets are a terrible example. Centrally planned economies of course destroy all individual incentives to excel or improve.
But if you seriously think the only way economic systems like the US's can function is by threatening starvation and exposure to incentivize working, then what is the point of progress to begin with if we are stuck in the same vicious cycle regardless? All progress is effectively meaningless if at the end of the day you are still laboring to not die of hunger.
And the Soviets didn't have bread to begin with. Their people starved not because of "free" stuff, but because there was "no" stuff so long as the central planning existed and was as inept as it was (that, or intentionally restricted).
>And the Soviets are a terrible example. Centrally planned economies of course destroy all individual incentives to excel or improve.
Which economy is not centrally planned? We have more and more central planning every year.
The Soviet Union did not collapse because of central planning. It collapsed because everyone was lazy and stole shit from the factories. No productivity, no quality. People were drinking at work and were utterly incompetent. This is what happens when you remove the incentives for improvement. If you guarantee people they can eat, no questions asked, they just stop caring after 10-15 years. They just forget to care, this is the new reality for them. No repercussions, why even bother. Who are you to tell me to work harder? Why should I work at all? I am ENTITLED to your money so I can buy food.
And when the proletariat smells they can get other people's money, will they stop at just a pinch, so they can buy the very basic necessities? Or will they go and protest and push for more money in 15 years? Those fucking 1%-ers! They are not better than me. Who said they are better than me? Why should they have all this money. I need to feed my kids! And for booze.
The Soviet Union collapsed because they distributed everything to everyone; the problem was that it was all terrible and the supply was insufficient, due to any one of many things such as problems in distribution, corruption among the suppliers, etc. People who were stealing were by and large doing so out of desperation, which is already happening now in multiple areas around the US and elsewhere.
The vast majority of people, when given access to a basic amount of money (not benefits, not stamps, just money), will spend it in such a way as to NOT cause themselves additional misery via drugs, alcohol, or starving their children. Yes, some will, but the "welfare queen" is a myth perpetuated by people who stand to benefit from the social systems being cut back. They exist, but it's such a vanishingly small percentage that they might as well not exist; in comparison to the total welfare budget, it's a rounding error.
I've never heard of one person protesting against the rich saying that we should all be equal. I used to think that's what they were saying, but in actuality people who want equality want equality of opportunity, not equality of result, and to say that a kid growing up in rural Kentucky with the best of circumstances available there has the same opportunity as a kid in the suburbs of San Francisco is laughable on its face.
The simple fact is that a tiny fraction of our population can create a lot more wealth than the rest of them doing all the busywork they could possibly do, and this is only going to get worse as automation continues to increase. We need to rethink the idea that you need to contribute to eat, because it simply isn't true anymore. Now, we can either start giving people what they need to live because they're humans and shouldn't be left to freeze to death, or we can roll tanks on the neighborhoods with the most have-nots and keep up the crime, keep the property values low, and generally treat people like crap. I give it 50/50 either way at this point, because so many people are so married to the idea that the only way you should be kept alive is if you're a benefit to someone else. And I'm sorry, but in the wealthiest nations on the planet, with celebrities and CEOs raking in billions, you'd think we could manage to find some cash somewhere to keep people alive, if for no other reason than that we CAN do it.
Does the pace really matter? The coal miners of Kentucky and related areas have had decades, and way too many are still dying of drug overdoses while waiting for their 'next big break.'
"The prominence of truck drivers is partly due to the way the government categorizes jobs. It lumps together all truck drivers and delivery people, creating a very large category. Other jobs are split more finely; for example, primary school teachers and secondary school teachers are in separate categories."
... I really wish they'd collapsed some of those categories; it could have been even more revealing.
It's not just the jobs. Self-driving trucks will transform an industry where large numbers of humans are necessary into (yet another) industry where capital is the only thing that matters. Once trucks are self-driving, larger companies will gain the advantage, and one of the main economic sectors dominated by small to medium companies will become dominated by large international businesses.
Geoffrey Hinton's "Neural Networks for Machine Learning" on Coursera [0] is a pretty excellent course to cover the basics. There's a lot in the course, but if you just skim over the videos you'll get a pretty good "big picture" view of what's out there. As with any quantitative topic, it's best to take a first pass where you just glance at the math and come back later to really focus on the missing pieces.
An important thing to realize is that much of deep learning is decades old neural networks research that has for one reason or another become more viable recently.
If you really want to dive into the technical details, there's really no better book than Hull's "Options, Futures, and Other Derivatives". It's extremely well written, and if you have a basic understanding of calculus and probability the math isn't too difficult. The only catch is that it is a very expensive book, but a used copy a few editions back is more affordable. Also, don't worry about the "derivatives" subject matter: to understand derivatives you naturally have to understand the underlying instruments. If you just read the first 100 or so pages (covering futures pricing), you'll have a pretty good sense of the basics of thinking about financial markets.
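As a taste of what those first chapters cover, here's the standard cost-of-carry relationship for a futures price on a non-dividend-paying asset, F = S * e^(r*T), with made-up numbers:

```javascript
// Cost-of-carry futures price: the no-arbitrage argument is that a
// futures position must cost the same as buying the asset today and
// financing it at the continuously compounded risk-free rate until
// delivery.
function futuresPrice(spot, rate, years) {
  return spot * Math.exp(rate * years);
}

// Hypothetical numbers: $100 spot, 5% risk-free rate, 6 months out.
console.log(futuresPrice(100, 0.05, 0.5)); // about 102.53
```

If the market price deviated from this, you could lock in a riskless profit by trading the future against the spot asset, which is the style of argument the book builds everything else on.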
I really recommend this book even if you're not interested in Finance as a general guide for thinking about stochastic processes in a practical manner. Nearly all "basic" business/web metrics can be understood best if you understand how to correctly model financial instruments. Personally, I think the basics of quantitative finance are just as relevant to Data Science as machine learning is.
I guess "data science" is a pretty big term, so it depends what corner of the bubble you're in. My interest leans toward computer vision. The first thing that popped into my head is Kalman filters being useful in time series analysis, whether it be equities or object tracking.
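For a flavor of the idea, here's a minimal one-dimensional Kalman filter sketch (a constant-value model with invented noise parameters), the same predict/update loop that scales up to object tracking:

```javascript
// One predict/update step of a 1D Kalman filter tracking a constant
// value: blend the prediction with a noisy measurement, weighted by
// their relative uncertainties.
function kalmanStep(state, measurement, processVar, measurementVar) {
  // Predict: the estimate carries over; uncertainty grows a little.
  var predictedVar = state.variance + processVar;

  // Update: the Kalman gain decides how much to trust the measurement.
  var gain = predictedVar / (predictedVar + measurementVar);
  return {
    value: state.value + gain * (measurement - state.value),
    variance: (1 - gain) * predictedVar
  };
}

// Noisy readings of a true value of 10 (made-up numbers).
var readings = [10.4, 9.7, 10.1, 9.9, 10.2];
var state = { value: 0, variance: 1000 }; // start knowing nothing
readings.forEach(function (z) {
  state = kalmanStep(state, z, 0.001, 0.5);
});
console.log(state.value); // close to 10
```

The same structure, with vectors and matrices in place of the scalars, is what tracks an object's position and velocity from frame to frame.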
Here's an interview question I always ask that has worked pretty well:
"If you could wave a wand and instantly change one thing about this company/job/team, what would it be?"
This is similar to "what is wrong" but frames it in a positive light, so people are more open and creative.
If the answer is anything about people ("I wish communication was better", "It would help if more people were on board for this project", "A change in management wouldn't hurt, haha j/k", etc.), that's a red flag.
If it's about non-people things ("I wish we didn't have so much legacy code", "I would love it if we could get our testing setup better", "There are no good places to get coffee around here"), that's a good sign that there aren't major people problems.
If they can't think of one, that's a real cause for concern!
This is one of my favorite questions in general, because what people wish for tells you a lot about the major problems, but without people being guarded. They're fantasizing, not venting.
The problem with it (at least in my experience) is that most people don't know how to answer this, so they'll give the most generic answer they can conjure off the top of their mind. Furthermore, your interviewers aren't given any incentive to tell you what they really think, and they are much less likely to tell you something negative about the position when put on the spot like that.
That said, I usually ask this question when I can't think of what I really want to ask. :)
This might be true for a software engineer role at most companies, but if you're seeking a management role, especially if you're to be the first manager at a startup, then you're probably being hired specifically to address these people problems.
This is quite close to what happened with my first job, which was also my first experience with interviews.
The difference of course is that I didn't have the insight to ask the right questions, however that didn't pose a problem because the first thing the CEO said to me basically answered exactly this.
After two rounds with the person who would become my boss, I proceeded to the third round on one simple basis: I still had no clue what the company actually did (information logistics. Obvious?). I also had no clear picture of what I would be doing, only generic expectations. So I proceeded to the third round, sitting in the room with my headhunter (which was absurd in itself at the time), and the CEO opened the door, walked in, and exclaimed in frustration:
I fucking hate teflon heads.
At that point, I immediately knew this was the job for me, whatever it was. For me, at the time, only a few things were truly clear: this place had serious problems, the attitude of the management was not in line with that of the employees (that seemed clear even without having met a single individual), and there was a serious gap between the management's goals and the organization's ability to execute. To be honest, all that went on mainly in my subconscious; at the time I told myself that this place was in such deep shit that I couldn't possibly fuck it up any further.
It turned out to be the right call. The organization was blowing in changing winds, and the current (20) employees were doing their best to maintain business as usual rather than business as necessary. Due to the chaos and my initial understanding of the situation, I quickly progressed and ensured that whatever I did, I would be making clear and firm decisions in line with what was needed to achieve the main goals, regardless of whether it was my mandate to make those decisions.
I realize that this could all have gone the wrong way as well. I could be permanently stuck implementing new customers in Cobol (new-from-scratch Cobol code, in 2007), but instead I pushed for complete and fundamental change. Such is the path to architecture: leading through example, confidence, and communication.
For the rest of the interview, the CEO continued to deliver the same canned pitch of what the company did, with minimal interest, just wanting to sign and get it over with. That first sign of weakness was enough to predict what kind of troubles I would be facing. I went eyes open into a troubled organization and stuck with it for seven intense (but moderately happy/frustrating) years. I wouldn't trade that experience for anything, but at the same time I would never recommend that company as an employer to anyone. I now know that I can thrive in this kind of negative situation and improve it for everyone involved, so what would be a clear red flag to someone seeking a simple happy path is an open invitation to me.
All in all an extremely rewarding adventure, filled with challenges of every type at every turn.
Thank you for this. Beyond correctly identifying the situation you were walking into, what indications did you have (or could someone look for) that you would have the latitude to turn it around?
Basically, if memory serves right, I based some of this on what I could glean from the organization's website. The company in question was a small (20-person) national/regional satellite of a larger (1500-person) multinational, itself a daughter company of an even larger (30k) national/government corporation. To simplify: this is Europe, specifically the Nordic region, where small national populations require their own footprint, and a massive player in one country may be unheard of in the neighbouring country.
From the website, I could clearly see that the company's ambition wasn't anywhere near the reality I was presented with (and even from there I couldn't quite grasp what they actually did; to clarify with hindsight, information logistics basically entails accepting massive amounts of documents for distribution from large businesses, interpreting and transforming those to a destination format and layout, running distribution optimisation, and finally multi-channel distribution to electronic archives, postal services, electronic invoicing, email, et cetera).
Given the attitude of the CEO, it was clear he was painfully aware of the disparity, and I decided that the position I was being hired for had the potential to do something more should I be up to the task (though fairly clear that the CEO had given up on humanity in general and didn't have any idea of how to get out of the rut except doing more of the same to keep up with demand).
To be clear, this could all have gone terribly wrong, and if I hadn't spoken out of place at a few occasions, it would have.
Generally though, the first thing I consciously did to put myself in a position to gain the trust of the employees (which is always fundamentally critical in a transformation) was to clearly request that I start work early, in a low position. In my case, that meant working the floor of the print house, dealing with the Xerox dp180 printers and some Pitney Bowes enveloping machines. This was a menial area, relatively devoid of tech except for the so-called production control system, a bastardized AS/400 running some software from 1994 which had been repurposed as the finance system at some point.
I think I would generally equate this to starting out in helpdesk or local IT support for a few weeks. Absolutely invaluable.
Once I progressed to my position proper, it was far easier for me to draw correlations between production problems and related inefficiencies in systems design, maintenance, etc. Essentially I started asking questions no one had asked before and started bridging the divide. Less than a year later, I had the mandate and ability to tear the whole system apart, costing us the most senior programmer there at the time (he didn't agree to anyone other than him touching his baby, so he left).
Long story short: in desperate situations where management has exhausted their known options and is bound by higher powers, they will accept reasonably formulated and logical arguments, provided it actually solves THEIR problem. Stepping over the roadblocks above you to meet those issues head on is a necessary part of that.
As far as I can tell Wall Street has always had a penchant for pedigree in addition to skills. Large companies are often the same. People used to talk about all software engineers needing CS degrees in the future. Again in Wall Street and large companies you'll see this, but there are still plenty of well paid, talented software developers with no formal CS training.
In my experience hiring and chatting with other people hiring data scientists, there's the same trouble as there is with software engineers. No matter how many people have the training there's still a dearth of applicants that are truly talented and can actually do things. PhDs fleeing academia for a promise of easy employment and money are a huge bulk of new data scientists that I've seen and most of them have a very hard time taking deep knowledge and applying it to solve real-world problems.
At least in tech I think the future of Data Science lies in the perpetually small group of people that will have a proven track record of coming into companies and actually solving problems, just as it has been in software development.
"There are only two kinds of languages: the ones people complain about and the ones nobody uses."
-- Bjarne Stroustrup
I've seen many programming languages become popular (or fail to) over the years and this quote has always seemed to hold true.
With Scala there was the first wave of people working in it that claimed that it was a grand panacea for all the problems in software (as is always the case with new languages).
However, it wasn't until people started to really claim that the language was certainly doomed that it clearly was a success.
In general I have found that Stroustrup's quote is, counter-intuitively, a good way to determine whether the next hot new language will really stick or not. Furthermore, I have to admit that even some of my favorite programming languages fail this test, and honestly those languages are extremely unlikely to ever achieve mainstream success.
And if you really think about it, it's not so counter-intuitive. Programming languages don't show their real limitations until you are very deep in a large complicated project. The frustrations of the beginner are never the same as the frustrations of an expert and only an expert can really feel that a language is "doomed". This sense of "doom" is often just the realization of the once language X zealot who now sees that this new language is not a true panacea. But this moment of disillusionment is also the moment an idealized programming language has proven itself a practical one. The more people that feel this loss of faith in their favorite new language, the more people are building large, practical, real-world software projects with it.