Strategy

Tech strategies to compete and win

Planning for a post-work future

It’s that time of year when many bloggers make their predictions for next year. Rather than do that, I wanted to look a generation out, when those who are entering college today send their children to college, and think about the effects of automation on our future. This is not a prediction per se. Instead it is more of an RFI (a request for ideas).

As a caveat, I work in automation (machine intelligence and machine learning, sensor and computer vision, automated controls and planning systems). I also have a prior background in policy—one that is driving me to think about the bigger picture of automation. However, this post is not about the work I am doing now. It is about the near-term “practical realities” I can imagine.

We are at the onset of an “Automaton Renaissance.” Five years ago, most people outside of tech thought about self-driving cars as something from the Jetsons. Last week, the governor of Michigan signed a bill allowing live testing of self-driving cars without human testers. Chatbots are not just the stuff of start-ups. Last month, I attended a conference where Fortune-500, large-cap companies were sharing results of pilots to replace back office help desks and call centers with chatbots. Two weeks ago, I was at a supply chain conference where we discussed initial pilots that are replacing planning and decision-making with machine learning (pilots involving billions of dollars of shipments). Automation is not coming; it is here already, and accelerating. Last week, I was at a conference for advanced manufacturing, where the speakers discussed the current and future impacts (good and bad) on jobs in the US.

So what will life (and work) be like in 20 years? Here are just a few things that we already have the technology to do today (what is left are the less futuristic problems of mass production, rollout, support and adoption):

  • If you live in the city and suburbs, you will not need to own a car. Instead you can call an on-demand autonomous car to pick you up. No direct insurance costs, no auto loans, less traffic and pollution. In fact the cars will tell Public Works about detected potholes (street light and infrastructure sensors will tell Public Works when maintenance is needed).
  • If you work in a manufacturing plant, you will be one of fewer workers, monitoring and coordinating advanced manufacturing (automation + additives). The parts will have higher durability and fewer component suppliers, which also means a reduction in delays, cost and pollution.
  • If you work on a farm you will demonstrate (supervised learning) to drones how you want plants pruned and picked, holes dug, etc. These drones will reduce back-breaking labor, reduce accidents and automatically provide complete traceability of the food supply chain (likely via Block Chain).
  • If you do data entry or transcription, your work will be replaced with everything from voice recognition-based entry, to Block Chain-secured data exchange, to automated data translation (like the team is doing at Tamr).
  • 95% of call center work will be handled by chatbots. Waiting for an agent will be eliminated (as will large, power-hungry call centers). The remaining 5% of jobs will be humans handling escalations of things the machines cannot.

These are just five examples. They are all “good outcomes” in terms of saving work, increasing quantity and quality of output, reducing cost (and price), and even improving the environment. (If you are worried about the impact of computing on energy, look at what Google is doing with making computing greener.)

However, they will all radically change the jobs picture worldwide. Yes, they will create new, more knowledge-oriented jobs. Nevertheless, they will greatly reduce the number of other jobs. Ultimately, I believe we will have fewer net jobs overall. This is the “post-work future” — actually a “post-labor future”, a term that sounds a bit too political. What do we do about that?

We could ban automation. However any company or country that embraces it will gain such economic advantage that it will force others to eventually adopt automation. The smarter answer is to begin planning for an automation-enhanced future. The way I see it, our potential outcomes fall between the following two extremes:

  1. The “Gene Roddenberry” Outcome: After eliminating the Three D’s (dirt, danger, and drudgery) and using automation to reduce cost and increase quantity, we free up much capacity for people to explore more creative outcomes. We still have knowledge-based jobs (medicine, engineering, planning). However, more people can spend time on art, literature, etc. This is the ideal future.
  2. The “Haves vs. Have Nots” Outcome: Knowledge workers and the affluent do incredibly well. Others are left out. We have the resources (thanks to higher productivity) but we wind up directing this to programs that essentially consign many people to living “on the dole” as it was called when I lived in the UK. While this is humane, it omits the creative ideas and contributions of whole blocks of our population. This is a bad future.

Crafting where we will be in 20 years is not just an exercise in public policy. It will require changes in how we think and talk about education, technology, jobs, entitlement programs, etc. Thinking about this often keeps me up at night. To be successful, we will need to do this across all of society (urban and rural, pre-school through retirement age, across all incomes and education levels, across all countries and political parties).

Regardless of what we do, we need to get started now. Automation is accelerating. Guess how many autonomous vehicles will be on the roads in the US alone by 2020 (virtually three years from now):

10 million

Note: The above image is labeled for re-use by Racontour. Read more on the post-work world at The Atlantic magazine and Aeon magazine.

Data Scientists vs. Data Engineers: Facts vs. Interpretation

Some of the things we build at work are closed-loop, Internet-scale machine learning micro-services. We have created algorithms that run in milliseconds that we can invoke via REST calls, thousands of times per second. We also have created data pipeline processes that process new (mostly sensor) data and build and publish new models when critical thresholds are reached. This work requires the collaboration of two very in-demand specialists: Data Scientists and Data Engineers.

Contrary to the classic Math vs. Coding vs. Domain Expertise Venn diagram, Data Scientists and Data Engineers share many similarities. Both love data. Both have domain expertise. Both are great functional programmers. Both are good at solving complicated mathematical problems, both discrete and continuous. Both use many similar tools and languages (in our case, Spark, Hadoop, Python and Scala).

However, over the past two years, as we have improved the collaboration between the two to build better machine learning services, we have found some key differences between the roles. These differences are not just based on skill set or disposition. They also include different areas of responsibility that are essential to creating fast, scalable, and accurate machine learning services.

It is easy to muddle raw data with deterministically derived data and with algorithmically derived data. Raw data never changes. Rules may change but are easy to manage with clean version controls. However, even the same deterministic algorithms can produce different results (one example: whenever you refit or rebuild a model using new data, your results can change). If you are building algorithmic services you need to keep everything clean and separate. If not, you cannot cleanly “learn” from new data and continuously improve your services.
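One way to picture this separation is to tag every value with the layer it belongs to and the rule or model version that produced it. The following is only a minimal sketch under stated assumptions: the names `Layer`, `Record`, `apply_rule` and `fit_baseline`, the Celsius-to-Fahrenheit rule, and the mean-as-model stand-in are all hypothetical illustrations, not the services described in this post.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean


class Layer(Enum):
    RAW = "raw"                      # never changes once captured
    DETERMINISTIC = "deterministic"  # produced by versioned rules
    ALGORITHMIC = "algorithmic"      # produced by a fitted model


@dataclass(frozen=True)
class Record:
    layer: Layer
    value: float
    produced_by: str  # rule or model version; "sensor" for raw data


def apply_rule(raw: Record, rule_version: str = "c_to_f-v1") -> Record:
    """Deterministic rule: the same raw input and rule version always give the same output."""
    if raw.layer is not Layer.RAW:
        raise ValueError("rules may only read raw data")
    return Record(Layer.DETERMINISTIC, raw.value * 9 / 5 + 32, rule_version)


def fit_baseline(history: list[Record], model_version: str) -> Record:
    """Algorithmic estimate (a plain mean here): refitting on new data can change it."""
    if any(r.layer is not Layer.DETERMINISTIC for r in history):
        raise ValueError("models may only read deterministic facts")
    return Record(Layer.ALGORITHMIC, mean(r.value for r in history), model_version)


raw = [Record(Layer.RAW, c, "sensor") for c in (20.0, 22.0, 24.0)]
facts = [apply_rule(r) for r in raw]
model_v1 = fit_baseline(facts[:2], "baseline-v1")  # fitted on the first two readings
model_v2 = fit_baseline(facts, "baseline-v2")      # refitted after new data arrives
```

Because the raw records and the rule outputs are immutable and versioned, the difference between `model_v1` and `model_v2` can be attributed to the refit alone, not to a silent change in the underlying facts.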

We have found a very nice separation of responsibility that prevents muddling things:

  • Our Data Engineers are responsible for deterministic facts
  • Our Data Scientists are responsible for interpretation of these

This boils down to this: deterministic rules are the purview of engineers while algorithmic guesses come from scientists. This is a gross simplification (as both roles deal in many, many complexities). However, this separation keeps things very clear, not only in determining “who does what” but also in preventing errors, guesses, and other unintended consequences that pollute data-driven decision-making.

Let’s take Google Now’s “Where you parked” service as an example. Data Engineers are responsible for processing the streaming sensor updates from your phone, combining this with past data, determining motion vs. at rest, factoring out duplicate transmission, geospatial drift, etc. Data Scientists are responsible for coming up with the algorithm to determine whether your detected stop state is a place where you parked (vs. simply being at work, at home, or at a really bad stop light). Essentially, Data Engineers capture and process the data to extract required model features while Data Scientists come up with the algorithm to interpret these features and provide an answer.
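The split in the parking example can be sketched as two functions with a plain feature record between them. This is a toy illustration, not how Google Now works: the `Ping` and `StopFeatures` types, the speed and duration thresholds, and the `looks_parked` heuristic (standing in for a fitted model) are all assumptions made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ping:
    t: float      # seconds since the start of the trace
    speed: float  # metres per second


@dataclass(frozen=True)
class StopFeatures:
    duration_s: float         # how long the device has been at rest
    speed_before_stop: float  # last moving speed before the stop


def extract_stop_features(pings, rest_speed=0.5):
    """Engineering side: deterministic de-duplication and motion vs. at-rest split."""
    seen, unique = set(), []
    for p in sorted(pings, key=lambda p: p.t):
        if p.t not in seen:  # drop duplicate transmissions of the same timestamp
            seen.add(p.t)
            unique.append(p)
    moving = [p for p in unique if p.speed > rest_speed]
    if not moving or moving[-1] is unique[-1]:
        return None  # never moved, or still moving: no trailing stop to describe
    last_moving = moving[-1]
    return StopFeatures(
        duration_s=unique[-1].t - last_moving.t,
        speed_before_stop=last_moving.speed,
    )


def looks_parked(f, min_duration_s=300.0, min_speed=3.0):
    """Science side: a placeholder heuristic standing in for a fitted model."""
    return f.duration_s >= min_duration_s and f.speed_before_stop >= min_speed


pings = [Ping(0, 12.0), Ping(60, 10.0), Ping(60, 10.0), Ping(120, 0.0), Ping(600, 0.1)]
features = extract_stop_features(pings)
parked = looks_parked(features) if features is not None else False
```

The point of the design is that `looks_parked` can be swapped for a better model without touching the feature extraction, and the extraction can be made faster without changing what the model sees.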

Once you have separation down, both teams can collaborate cleanly. Data Scientists experiment and test algorithms while Data Engineers design how to apply them at scale, with sub-second execution. Data Scientists determine what approach is used to build models (and what triggers model optimization, build and re-fitting). Data Engineers build a seamless implementation of this. Data Scientists build algorithm prototypes and MVPs; Data Engineers scale these into fast, reliable services. Data Scientists worry about (and define rules to exclude) outliers that would wreak havoc on F-tests; Data Engineers implement defensive programming and automated test coverage to ensure unplanned data does not wreak havoc on production operation.
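The last pairing, outlier rules versus defensive programming, can be sketched as follows. The post does not say which outlier rule is used, so this sketch assumes Tukey fences (1.5 times the interquartile range) purely for illustration; the function names `validate_reading`, `exclude_outliers` and `clean` are likewise hypothetical.

```python
import math
from statistics import quantiles


def validate_reading(value):
    """Engineering side: reject data the pipeline never planned for."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"expected a number, got {type(value).__name__}")
    if math.isnan(value) or math.isinf(value):
        raise ValueError("reading must be finite")
    return float(value)


def exclude_outliers(values, k=1.5):
    """Science side: drop points outside the Tukey fences (k * IQR beyond the quartiles)."""
    if len(values) < 4:
        return list(values)  # too few points to estimate quartiles sensibly
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if low <= v <= high]


def clean(batch):
    """Validate each item first, then apply the statistical rule to what survives."""
    valid = []
    for item in batch:
        try:
            valid.append(validate_reading(item))
        except (TypeError, ValueError):
            continue  # a production service would log and count these
    return exclude_outliers(valid)


batch = [10, 11, "n/a", 12, float("nan"), 13, 11, 12, 500]
cleaned = clean(batch)
```

Here the malformed entries are stopped by validation before any statistics run, and the extreme but well-formed reading is removed by the outlier rule, so each team's rule stays testable on its own.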