AI SUPERPOWERS: A Book for those WHO THINK about OUR FUTURE

A few weeks back, I came across the book AI SUPERPOWERS by Taiwanese-born author KAI-FU LEE. It is roughly 250 pages that anyone who works in the field of data analytics should read (or at least think about). It's one of those books that are best when you read them yourself. Therefore, I will try to keep my review at a reasonable balance between teasing and the feeling that you already know everything the book has to tell you.

AI SUPERPOWERS offers many points to think about. I personally counted at least 20 (!) thoughts that I realized I had not considered yet. However, before outlining some of them, we should explain who the author is. Kai-Fu Lee is a Taiwanese-born man who has worked for 35 years in the field of artificial intelligence. He started voice analytics for Apple, set up a Microsoft research center in Asia and, as the head of Google China, faced the dilemma of establishing Google in a country that does not necessarily celebrate its existence. He also manages venture capital funds that develop AI solutions. Kai-Fu Lee is a rare combination of experience with state-of-the-art AI approaches from Silicon Valley and a typically Asian "cautious overview" that does not accept simplifications, nor does it need to adhere to the "America is Great" cult. He praises where he sees real mastery and pinpoints hollow pretence and unwarranted stereotypes.

The reason I think you should read this book yourself is that between the official lines of the text you will likely find your own inspirations (as it was with me). The book is a busy tree where everyone can choose "how long they sit" on each branch. In essence, however, the book is a cocktail of three complementary streams (some of which you would not expect to appear, judging by the book title alone):

The first stream (most in line with the name of the book) describes developments in the field of artificial intelligence. It contrasts how different the paths to sophisticated analytics were for the US and China. Taiwan and Hong Kong have a bond with China, but their relationship is not, ehm, optimal. (I have a colleague from Hong Kong who often narrates it in detail.) So Kai-Fu Lee's position is not a rose-tinted ode to the Chinese model. Quite the contrary, it offers a very balanced view of where China stands in the AI area and where it lags behind the US. As he has experienced both environments, his comparison is a valuable counterweight to the general propaganda both for and against China.

The second line is the author's personal account of how (thanks to the cancer he managed to overcome) he changed his view on the direction in which artificial intelligence should go. The story of a seriously ill man who, staring at a possible end of his life, completely alters his way of thinking is almost a cliché in our culture. But if you manage to stay less cynical, close your eyes to the emotional aspect of the story and focus in this section rather on his conclusions, it becomes an inspirational read.

The third stream of the book was a bit of a surprise to me. But a pleasant one. Misled by the title of the book, I did not expect the author to try to extrapolate AI trends and describe what awaits us. The focus of this last part is similar to the (for me excellent) Superintelligence book, so it makes for a very inspiring read. However, as AI SUPERPOWERS came out later, it already looks at some future aspects of AI with richer material, including results of the first experiments (e.g. with UBI), and hence with a more specific narrative.

However, in order not to just scratch the surface of this masterpiece, let me offer a few specific inspirational ideas that this book has brought to me. I believe they might be the right "teaser" to get you to actually read the whole book:

Copy-cat China. The book clearly and in detail depicts that China reached its peak in economic significance by copying foreign products. The author therefore bluntly admits that in industrial production and the design of material things, China is certainly not a ruling world power, but rather an "embarrassing copier". However, the development of online services, AI and data analytics has gone through a completely different story. As a result, the latest Chinese advancements in AI and digital services still sit a bit in the shade of the "copy-cat sticker" of the past. But the book clearly explains that it would be foolish, and outright dangerous, for the external world to keep perceiving China through this mental illusion.

From cash directly to app-pay. Some parts of Africa lagged for a long time in building a fixed-line network, so many areas were cut off from the world. Then suddenly, with the advent of mobile networks, they could skip the landline stage and get access to the internet directly via the mobile network. A similar episode took place in China in the area of payments. In China, credit cards never properly settled in as a form of payment. And when e-commerce was launched, the market jumped directly to in-app payments like WeChat Pay or Alibaba's Alipay.

4 AI development forces. Like any effort, the development of artificial intelligence has its own factors that can accelerate or hinder it. In the case of AI, the following 4 dimensions seem to be relevant: a) computing power in the form of hardware, b) sufficient human talent, c) the volume and quality of data you have for training AI, d) the business underpinnings for implementing the developed solutions. At the same time, the extent to which a particular country fulfils these 4 factors predetermines what role that country should take in applying AI. I also used this knowledge when preparing the AI strategy for Slovakia, whose construction I had the honor to participate in.

The status of your phone's battery. When predicting phenomena, you should use all the available inputs and be aware of whether you are limiting the possibilities of AI with your very own prejudices. The book gives some great examples on this subject; the one I liked most is how the usual state of your mobile battery relates to your discipline in paying off financial obligations. My regular readers know I am a strong promoter of feature engineering and data riddles, so I really enjoyed this part.

Probably this much, Your Honor. In many areas, AI will serve as a counselor to humans. Medicine is often discussed, but justice has so far been more of a taboo. Artificial intelligence can be helpful in this sensitive area too, without machines deciding about us. There are already systems that search historical court records to detect false testimonies of witnesses, contrasting them with the information given in previous legal litigation. Moreover, AI can provide inputs to calibrate the severity of penalties for the same criminal acts (using scatter plots of particular aggravating/attenuating circumstances against the length of sentences to see whether a proposed sentence is too strict or too lenient).

Autonomous car(t)s. The discussion of self-propelled vehicles zooms in primarily on autonomous cars. However, there are much simpler implementations that are both less dangerous and show much more immediate, mass-use potential. These include shopping carts, for example. They could be programmed to follow you (and stop whenever you turn to fetch something) or even to set the fastest route through the supermarket, depending on where the items from your shopping list are located in the store.

Hold on, I'm sending you a drone. The second implementation of self-propelled vehicles that is simpler than cars are flying machines. No, this is not hype about drones; there really is more space in the air and a lower chance of collision than on the road. We might not realize it, but planes were equipped with autopilots sooner than cars. We also have pilotless attack aircraft, but not unmanned tanks or warships. Therefore, one of the nearest uses of AI will be unmanned rescue units that can extinguish fires or rescue people even in exposed terrain, without compromising the lives of a helicopter or aircraft crew.

O2O, the key to platform success. Online-To-Offline (O2O) is a concept where you start a service in the online environment, but at its end there is material fulfillment in the physical world. Examples of such services are e-commerce, Uber or Booking.com. Markets that offer O2O products are more tangible to people than purely virtual services (self-learning courses, online software business). People feel the physical dimension of such a service. Therefore, we are also more willing to pay for it (such as pizza delivery), while services such as online tax advice are only slowly collecting their enthusiasts.

What is different this time? Past industrial revolutions are often used as an example of how mankind has dealt with harsh changes in the labor market. Thus, the optimists say that even AI will not be a disaster for jobs (by the way, the book spends a few words on why Kai-Fu Lee is not so optimistic on this note). The book brings one interesting twist to this issue, the deskilling paradigm. When reading history attentively, we find out that the jobs that sprang up after the industrial revolution required of workers a shallower knowledge of the matter than their pre-revolution alternatives (weaver vs. weaving mill operator, mathematician vs. man with a calculator, …). This phenomenon is called deskilling. The important question remains whether we are ready to admit such a development for healthcare professionals or teachers. To put it in one sentence: in the AI industrial revolution, the vocations at stake are those where the credibility of the profession is linked to the human factor.

Bigger surveillance, not weaker. Due to the accumulation of data and some other factors, AI services have a greater tendency towards monopoly than other economy sectors. (Anybody, Google?) It is therefore important that the AI industries are subject to stronger, rather than weaker, (antitrust) regulation than traditional industries. However, states are lagging behind both in legislation and in the competency to steer them. There is no clear idea of how to regulate services such as Facebook, and public authorities, at the same time, lack educated employees to supervise them in the first place. It feels almost as if there were no one with a medical education in the Health Care Supervision Office.

If you are interested in any of the topics, I encourage you to read the entire book. It's really worth it. If you're still wondering whether it's a good (time) investment, check out Kai-Fu Lee's video, where he talks about some parts of this book.

Two Big Macs and one Big Data on the side, please.

Finally, it was our turn in the queue. "Two Big Macs and one Big Data on the side, please," says my colleague nonchalantly. The girl behind the counter is visibly stunned, her glance jumping alternately from one of us to the other. She is balancing somewhere between suspecting she misheard the second part of the order and worrying that she has not yet studied the entire menu of the restaurant properly. Then she gently flushes and nods in confirmation. With huge effort we fight back laughter to keep ourselves from giving the prank away.

– – –

This was a joke we tried a few years back on one of our McDonald's visits. Putting Big Mac and Big Data into the same context was really just a prank. How it finally worked out, you will find out at the very end of this blog. However, what served as a teasing joke back then is no longer laughable today. Even as straightforward a business as fast food undoubtedly is starts to discover the nooks of data analytics and applications of artificial intelligence.

According to WIRED, the fast-food giant has decided to buy the Israeli firm Dynamic Yield, which specializes in machine learning algorithms supporting sales and customer service. If the bare essence of the message has not raised your eyebrows, let me add that this is the biggest acquisition McDonald's has made over the past 20 years. Backstage expert information suggests that the Dynamic Yield price was north of $300 million, or about 7% of McDonald's worldwide cash flow, or 5% of its global annual revenue for the past year! For comparison, it is about as much as it costs them to build the restaurants for all the Scandinavian countries combined. Perhaps some of you will shake your heads: What does McDonald's see in artificial intelligence that makes it willing to "throw" such huge money at it?

While McDonald's products are so standardized that they are often perceived as the cornerstone of simplicity, you might wonder what there is in McDonald's business to analyze in such depth. Some would fall for the usual suspects of optimizing inventory logistics or the efficiency of frying hamburgers and fries. At least this is how we, the customers, see McDonald's from our side of the counter. Therefore, I bet you may be surprised that the real reason for the fast-food chain to chew into sophisticated analytics is plain customer data. As McDonald's is still severely limited by the physical number of different products it can offer you (contrary to Amazon, for example), client data is not primarily helpful for making yet more upgrades or new versions of burgers. The way they reportedly plan to use the Dynamic Yield technology is surprisingly different.

McDonald's seeming data analytics bonanza is the drive-thru process. (… which is not the focus of sales in most of Europe, but makes up an important share of the overall market in the core Western markets. Thus, Big Data might not find immediate use in our region for some months, but in the US it has major consequences.) You may have noticed that most of the offers and promo banners in the McDonald's restaurant chain have lately been transformed into digital displays. This change not only speeds up the exchange of new offers for old ones, but also allows the offer to be personalized for a particular customer. Certainly, with standard ordering directly at the counter in the restaurant this makes little sense, because the "personalized" offer would confuse the other clients waiting in the queue. But in the drive-thru it is all possible. So how will it all work?

Upon approaching the drive-thru, the system attempts to recognize which customer it actually is. There are, as a matter of fact, already several alternatives for doing so: recognizing the car number plate, beacons fishing for your mobile device, credit card details, or at least a unique enough combination of products you order. Having identified (or estimated) your identity, the system will then take advantage of the time you spend waiting for your order to be fulfilled and, using local weather data, info on nearby events and the current popularity of menu items you have not spontaneously added to your order, will offer you personalized coupons (which can be turned into an extension of your purchase via one-touch purchase). What is more, if the system recognizes you before you place the order, it can even incorporate factors like the duration of meal preparation (to match your (in)patience profile), or the relative length of the queue versus the standard you experienced at a similar time of day in the past.
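To make the mechanics a bit more tangible, here is a minimal, purely illustrative sketch of the coupon-ranking step in Python. The context features, coupon names and hand-written scoring rules are all my own assumptions for the sake of the example, not McDonald's or Dynamic Yield's actual logic (which would presumably rely on trained models rather than hard-coded rules):

```python
from dataclasses import dataclass

@dataclass
class Context:
    temperature_c: float         # local weather at order time
    queue_length: int            # cars currently waiting
    is_returning_customer: bool  # was the car / card / order pattern recognized?

# Hypothetical candidate coupons with hand-tuned affinity rules.
COUPONS = {
    "McFlurry -30%":   lambda ctx: 0.8 if ctx.temperature_c > 22 else 0.2,
    "Hot coffee +1":   lambda ctx: 0.7 if ctx.temperature_c < 10 else 0.1,
    "Upsize the menu": lambda ctx: 0.5 if ctx.queue_length < 3 else 0.3,
    "Loyalty dessert": lambda ctx: 0.6 if ctx.is_returning_customer else 0.0,
}

def pick_coupon(ctx: Context) -> str:
    """Return the most promising coupon for the current drive-thru context."""
    scored = {name: rule(ctx) for name, rule in COUPONS.items()}
    return max(scored, key=scored.get)

if __name__ == "__main__":
    ctx = Context(temperature_c=27.0, queue_length=2, is_returning_customer=True)
    print(pick_coupon(ctx))  # -> "McFlurry -30%" on a hot day with a short queue
```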

However "banal" this might sound to you, properly personalized up-sell offers normally achieve a 3-7% success rate. McDonald's serves on average 68 million clients per day. Add to that the fact that speeding up the service might prevent some customers from dropping out of a slow-moving car queue, and one can easily imagine that a 5% uplift in sales can be achieved almost instantaneously. Moreover, as the number of generated client coupons grows, the system learns to be increasingly targeted (and therefore more successful). Whether any benefits can be materialized from servicing (with AI's help) the "non-motorized" customers inside the restaurant will be the extra joker card of this project. Face recognition systems are already a tool in retail, so it will not be long before they prove their might in the regular counter-serving of hungry guests. The greasy, fast-food mass production, as McDonald's is often seen, is seemingly moving into a new era. Who would have expected that just a couple of years back?
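Just to put the order of magnitude into perspective, here is a quick back-of-the-envelope calculation. Only the 68 million daily customers and the 3-7% success range come from the figures above; the average value of an accepted coupon is my own rough assumption:

```python
# Rough upsell impact estimate; illustrative assumptions only.
daily_customers = 68_000_000   # figure quoted above
upsell_success_rate = 0.05     # midpoint of the cited 3-7% range
extra_item_value_usd = 1.50    # assumed average value of an accepted coupon

daily_uplift = daily_customers * upsell_success_rate * extra_item_value_usd
print(f"~${daily_uplift:,.0f} extra revenue per day")  # ~$5,100,000
print(f"~${daily_uplift * 365 / 1e9:.1f}B per year")   # ~$1.9B
```

Even if only a fraction of that materializes, it puts the reported $300 million price tag into a rather different light.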

– – –

Laughter finally beat us. We had tried our best to stay calm, but when the order tray landed in front of us, we couldn't hold it anymore and burst into laughter. Large fries were lying beside the two big burgers. We quickly grabbed the burgers, leaving the ordering colleague with the long Big Potatoes instead of lunch. Well, he deserved it, after all. He ordered Big Data a few years too early.


Why should Data Scientists BE SCARED OF AI coming for them as well?

The debate on whether Artificial Intelligence (AI) will slash some jobs (or entire professions) has transformed from obscure omen-reading into a mainstream, heated issue. Truck drivers, financial intermediaries and a few other professions are nervously looking ahead to see whether they are going to join the red list of endangered species. They certainly have good reasons to be worried …

… but what about data analytics? Are Data Scientists on the AI replacement to-do list as well? A silly question, isn't it? Ultimately, Data Scientists are the ones pumping the oil of AI solutions. Thus, they will be the ones eating others' jobs, and they do not need to worry about their own future. Or should they maybe?

How sure you should (not) be

A few months ago, economic expert commentaries were still shy in their hints that the state of the world economy might deteriorate in the quarters to come. Back then, it was one such message per week. In the last two weeks the matter has grown visibly more dramatic; the black omens now pop up on, literally, a daily basis. As we learned from the past, every economic crisis usually slashes a substantial number of job opportunities on the market. In this sense, the crisis to come will be no different. We have learned not to worry about that too much, as new jobs are recreated when the economy walks from crisis back to good times. The problem is that on this account the nearest crisis will be different. It will kill some jobs that will never be recreated again.

Almost every week you can see some profession striking for salary increases. As the economy is booming, employees push to reap a slice of the victory cake. However, there are some jobs where salaries have kept rising without any push from labor unions. Data Science is one of those areas where annual income has been on a crazy adventure to the north. Driven by over-demand against (weak) supply, companies were raising pay levels to lure people away from competitors (or to motivate more people to requalify as Data Scientists). But no more. Data from the US (the largest free Data Science labor market) indicate that the entry salary of a Data Scientist stagnated in 2017 and corrected a few percentage points down in 2018. The reason is that the price of Data Science talent got above the level at which the business case for their possible impact in a company still justifies their pay. Not many people realize that the higher remuneration of these years is the last dance before the DJ calls off the party.

In both cases, the strike- or surge-driven salaries will make the AI replacement scenario more severe. When we come out of the crisis, employers will face the dilemma of whether to rehire staff again or to replace some part of it with automation. The higher the annual salary level of employees, the easier the case for AI solutions to be a cost-saver. Especially the area of super expensive (and still scarce) Data Scientists offers a lot of room for rethinking, as the one-year cost of a Data Scientist in the US is, literally, a 7-digit figure.

The (seemingly strong) peace of mind of the data community about their job security has its roots in a fatal attribution error. For most manual jobs, the replacement will come with automation, presumably intelligent computers running on data. Therefore, the data processing industry might be perceived as the lubricant of the whole automation process. Hence the strong belief that data scientists are on the right side of this transformation river. While data might, indeed, be the oil of the AI transformation, it is ill-conceived to think that humans necessarily need to take part in extracting it. If we stick to the analogy, most of the things on an oil rig are not human labor but automation itself. Similarly, the repetitive and easy-to-automate jobs in data analytics will not be run by humans. If you take two steps back and impartially review the work of most of today's data analysts, their work is much more well-defined and repetitive than driving an autonomous car. Therefore, the data community should not fall into the trap of the illusion that the AI job revolution will take a detour around their domain.

Time for panic?

The omens are out there, so is it time to panic? Well, we as humans had difficulties facing the previous industrial revolutions. And we will probably struggle this time around as well. Almost any time a disrupting technology arose in the past, the first answer was to push back by, literally, beating the machines. However, there are ways we can face the AI job hunt properly. I have been invited to speak about HOW TO SURVIVE the (first) AI ATTACK on DATA SCIENCE JOBS at DataFestival 2019 in Munich this week. This is a short teaser about the topic, and I offer you an exclusive sneak peek into the

PRESENTATION >>> FILIP_VITEK_TeamViewer_SURVIVAL_TICKET

here, as you are precious members of our TheMightyData.com community. As this topic hits all of us, any comments or views from you are highly welcome in the comments to this blog or at info@mocnedata.sk. Enjoy the reading and see you at some other event soon.

What does (should) good Feature engineering look like?

We are living in an era where data analytics has upgraded from a mere mainstream small-talk topic to a key business driver. Almost everybody tries to get better at advanced analytics, AI or other data-based activities. Some do so out of sheer conviction of its importance, some are forced by group-think or by competitors' moves. Let's admit it, for most organizations and teams this is quite a leap. So they, …

Leaning on methods

… lean on whatever literature or manuals there are around. I feel a strange mixture of sympathy and disdain for them, as almost all self-learning courses and books on analytics primarily focus on the methods of Data Science. You can read about which algorithms are suited for which tasks, how to program them, or how to use open-source packages that calculate them (as black boxes) for you. If you have dwelt in analytics for a bit longer, you probably smile already and know where I am heading with this point. A large part of the novices (and late- or returning-comers) fail to understand that the HOW (which AI method) is becoming more and more of a commodity. Thus, the more you try to specialize in those categories, the less prospective your future in analytics might be. Don't get me wrong, I am far from suggesting that a new Data Scientist should not (or need not) understand the methods behind it, quite the contrary. My point is that trying to get better at writing logistic regression or training decision trees is like trying to get better at digging potatoes in the era of tractors. Machines got so good at applying the analytical constructs that it becomes Don Quixote-ish to fight them. Algorithms are becoming commodities, even commodities that are freely available in the community. So where should you channel your skill improvement in analytics?
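To illustrate just how commoditized the HOW has become: this is roughly all it takes nowadays to train and evaluate a respectable model on a toy dataset. It is a generic scikit-learn sketch on a bundled sample dataset, not tied to any particular project:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# The whole "algorithm" part of the job is a handful of library calls.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```

The value you add is clearly not in these dozen lines; it lies in what you feed into them, which is exactly where feature engineering enters the stage.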

Where to go?

Of all the areas needed for successful analytics (see here), feature engineering seems to be the most promising bet if you want to make a significant impact. As it is the first step of the analytical funnel, whatever defects (or data loss) you introduce there are carried over to the whole analytical process. Kaggle competitions have proven that an informational disadvantage is almost never compensated by a "smarter" algorithm down the road of model training. What surprises me, though, is that in the ever-mushrooming inundation of books on analytics you find very little on feature engineering. Ironically, it is difficult to stumble even upon a definition of what feature engineering should and should not do. Not to mention best practices in this area.

That is exactly why I decided to sum up shortly in this blog what my experience in Machine Learning suggests good feature engineering should include:

1] Extending the set of features. Whenever you start to build an advanced analytical model, you only start with some group of raw parameters (= inputs). As features are ultimately to the model what food is to humans, your life is much livelier if you have both enough food and a variety of it. To achieve that, feature engineering needs to make sure that you "chew" the raw inputs and get some aggregated features as well. For example, you might have the history of customer purchases, but it is also important to calculate the lowest, usual and highest amount ever purchased. You would also find it useful to know whether the customer buys more in a certain part of the year, or how much time has passed since his/her last purchase. These are all aggregates. As you can probably smell from the examples, these are often standardized (like MIN, MAX, AVG, …) and foreseeable steps, so good feature engineering should include automated generation of the aggregates. However, besides the obvious aggregates, one also needs to create new pieces of information that are not directly represented in the raw inputs. When you are trying to predict a particular person's interest in buying ice cream, it is certainly interesting to know how long this person has already been buying ice cream products and what their total consumption is. But if you need to predict consumption in a certain short time frame, the ratio of ice creams per week would be more telling than the total consumption metric. That is where transformations, cross-pollination of features and completely new dimensions come in. Therefore, good feature engineering should extend the set of features with BOTH aggregated features and newly derived features.
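As a minimal illustration of both kinds of extension (the toy purchase table and column names are my own assumptions), the aggregates and the derived ratio could be generated along these lines:

```python
import pandas as pd

# Toy raw input: one row per purchase (hypothetical schema).
purchases = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "amount":      [12.0, 30.0, 8.0, 55.0, 60.0],
    "date": pd.to_datetime(["2019-01-05", "2019-02-11", "2019-03-02",
                            "2019-01-20", "2019-03-15"]),
})

# 1) Automated, foreseeable aggregates per customer (MIN, MAX, AVG, counts).
features = purchases.groupby("customer_id").agg(
    min_amount=("amount", "min"),
    max_amount=("amount", "max"),
    avg_amount=("amount", "mean"),
    n_purchases=("amount", "size"),
    last_purchase=("date", "max"),
)

# 2) Newly derived features that are not directly present in the raw inputs.
observation_date = pd.Timestamp("2019-04-01")
features["days_since_last"] = (observation_date - features["last_purchase"]).dt.days
span_weeks = (purchases.groupby("customer_id")["date"]
              .agg(lambda d: (d.max() - d.min()).days) / 7).clip(lower=1)
features["purchases_per_week"] = features["n_purchases"] / span_weeks

print(features.drop(columns="last_purchase"))
```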

2] Sorting out the hopeless cases. Plenty is certainly a good start, because from quantity you can breed quality. But being generous in step one also means you will have a lot of hopeless predictors that obviously (by their design) can have little impact on the prediction. There are many reasons why some features should be weeded out, but let me elaborate on one very common pitfall. Imagine you have 5 different sofas, from a spartan one to really fancy ones. Your model is to decide which of the sofa versions will be most appealing to a customer. If you have a parameter that has only two values (think male and female) and these are evenly distributed in the sample, it is difficult for this parameter to classify customers into 5 groups with just 2 values (yes, I said difficult, not impossible). The other way around, if you have the colour scheme of the sofas coded into 10,000 colour hues and about 5,000 customers in the sample, some colour options will not have even a single interested customer, so their predictive relevance becomes very questionable as well. Feature engineering should give you a plethora of inputs, but it should also spare you the pointless cases.
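A crude version of such a weeding-out step might, for instance, flag features whose cardinality looks hopeless relative to the task at hand. The thresholds below are arbitrary assumptions; in practice they depend entirely on your data and target:

```python
import pandas as pd

def flag_hopeless_features(df: pd.DataFrame, n_target_classes: int,
                           max_levels_ratio: float = 0.5) -> list:
    """Return columns whose cardinality looks suspicious for the given task."""
    flagged = []
    for col in df.columns:
        n_unique = df[col].nunique(dropna=True)
        if n_unique < 2:
            flagged.append(col)   # constant column, carries no signal at all
        elif n_unique < n_target_classes:
            flagged.append(col)   # fewer levels than classes to separate (the 2-value case)
        elif n_unique > max_levels_ratio * len(df):
            flagged.append(col)   # almost one level per row (the 10,000-hue case)
    return flagged

# Example: review candidates before training the 5-sofa preference model.
# hopeless = flag_hopeless_features(feature_df, n_target_classes=5)
```

As said above, "difficult, not impossible": treat the flags as a review list, not as an automatic delete.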

3] Prune to get more efficient model training and operation. Some methods downright struggle under the burden of too many input parameters (think logistic regression). Others can swallow the load but take ages to recalculate or retrain the model. So it is strongly encouraged to give models more breathing space. In order to do so, one has to care for two dimensions. First, you should not allow clutter from too many distinct values in one variable. For instance, modelling the health of a person based on their age is probably a sensible assumption, but nothing serious will come out of it if you count the age meticulously in the number of days the person had lived when the accident happened, rather than in years (or even decades). So good feature engineering should do (automatic) binning of the values. Secondly, the model should not struggle with maintaining parameters that bring little additional value to the precision of the prediction, and it is the role of feature engineering to prune the input set. A good model is not the most precise one, but the one with the fewest parameters needed to achieve acceptable precision. Because only then do you achieve the quality-through-quantity maxim mentioned before.
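Both steps have off-the-shelf building blocks. A rough sketch of automatic binning plus importance-based pruning (assuming a purely numeric feature table; the thresholds are again my own illustrative choices) could look like this:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

def bin_and_prune(X: pd.DataFrame, y, n_bins: int = 10) -> pd.DataFrame:
    """Bin high-cardinality numeric columns, then keep only the features whose
    importance in a quick forest exceeds the average importance."""
    X_binned = X.copy()
    for col in X_binned.select_dtypes("number").columns:
        if X_binned[col].nunique() > n_bins:
            # e.g. age counted in days -> roughly 10 quantile buckets
            X_binned[col] = pd.qcut(X_binned[col], q=n_bins,
                                    labels=False, duplicates="drop")

    selector = SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0),
        threshold="mean",
    ).fit(X_binned, y)
    kept = X_binned.columns[selector.get_support()]
    return X_binned[kept]
```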

4] Signal if I missed anything relevant. Last but not least, a good feature engineering approach should be your buddy, your guide through the process. It should be able to point you to areas of the information space that are under-served or completely missing from your input set. If you are modelling the probability of buying a certain product, you should have features on how pricy the product is included. If you don't, the feature engineering system should remind you of that. You might scratch your head, wondering how the system would know. Well, there are two common approaches to achieve this. You can either define a permanent dictionary of all possible features and create some additional layers/tags (like pricing issues) on top of the raw list. Upon reading the set you are just about to consider, the system would detect that there is no representative of the pricing category. If you do not have the resources to maintain such a dictionary, then you can use historical knowledge as the challenger: your system reviews similar models done by your organization and collects the areas of features used in those models. Then it can verify whether you also covered all the historically covered areas in your next model. Though this might sound like a wacky idea, in larger teams or in organizations with "predictive model factories", having such a tool is close to a must.
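The dictionary variant can be as simple as a tag lookup plus a set difference. The tags and feature names below are purely illustrative assumptions:

```python
# Hypothetical feature dictionary: feature name -> information-space tag.
FEATURE_DICTIONARY = {
    "avg_amount":      "spending",
    "days_since_last": "recency",
    "product_price":   "pricing",
    "discount_depth":  "pricing",
    "region":          "demographics",
}

def missing_categories(candidate_features) -> set:
    """Report information-space tags with no representative among the candidates."""
    all_tags = set(FEATURE_DICTIONARY.values())
    covered = {FEATURE_DICTIONARY[f] for f in candidate_features
               if f in FEATURE_DICTIONARY}
    return all_tags - covered

# Example: a propensity-to-buy model assembled without any pricing feature.
print(missing_categories(["avg_amount", "days_since_last", "region"]))
# -> {'pricing'}
```

The historical-challenger variant works the same way, except that the dictionary is harvested from the feature lists of your organization's past models instead of being maintained by hand.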

We have gone through the requirements that good feature engineering should meet. Now that you know them, you can return to the drawing board of your analytical project and think about how to achieve them. If you happen to work in Python, you can get a bit more inspiration on this topic from my recent presentation at the AI in Action Meetup in Berlin.


Berlin Meetup: Cool Feature engineering [my slides]

Dear fellows, 

on Wednesday, 20th Feb 2019, I was invited to speak at the AI IN ACTION Meetup organized by ALLDUS in Berlin. The topic was one of my favorite issues, namely feature engineering. This time we looked at the issue from the angle of How To Do Cool Feature Engineering In Python. If you had the chance to be in the Meetup crowd and failed to note down some figure, or if you are interested in reading about the ideas discussed even though you were not there, attached you can find the presentation slides from that Meetup.

slides >>> FILIP_VITEK_TeamViewer_Feature_Selection_IN_PYTHON

If you have any question or a different opinion on some of the debated issues, do not hesitate to drop me a few lines at info@mocnedata.sk.

Social Media have too much of the Diesel

This is not an ecology section of the TheMightyData.com portal. Nor is this blog about whether Facebook or LinkedIn drive electric cars or classical combustion engines. The issue here is more serious: our privacy and its handling are at stake. Well, just make your own call on this:

Indisputably the biggest scandal of the automotive industry was Dieselgate. In this well-protracted case, car producers modified the software in the car so that it runs more efficiently (read: with lower engine output), but only when the car's computer realized it was docked to an emission measurement device. As a result, the smog emitted by the car was much lower during the test compared to the values from real driving out there on the roads. Since most countries only run emission tests at technical inspection sites (equipped with those docking devices), diesel cars appeared to be more ecological than they truly are.

An important twist in this scandal was the fact that the automotive industry (by manipulating the car software) created an environment where they were the only guards of themselves. Being your own judge does not necessarily mean you will cheat. But if you add a strong competitive fight and cheaper market entrants, the odds of misbehaving rise. Therefore, often the only factor standing between fair customer treatment and fraud is management integrity. In retrospect, we know that automotive managers failed to hold onto the principles' railing.

If the manipulated cars were passing one emission test after another, you might wonder: How on Earth could they have been caught? The punchline of the story is actually very interesting and carries the lesson for the social media mentioned in the blog headline. The misbehaviour of the car producers was detected by an NGO that tests car economy in real-life usage. Their emission readings for diesel cars were an order of magnitude off the laboratory (docked) measurements, while regular gasoline engines' results were quite close to the official numbers. That raised the suspicion of experts and tipped off the avalanche of revelations of these fraudulent practices. But how does this relate to social media?

Today's social media businesses are just entering a period similar to the one car manufacturers went through. When it comes to fake news control or hoax eradication, they are both the messenger of the news and the watchdog of its quality (and user impact). It is only a matter of their inner moral integrity how well they are going to police the standards. What is more, they are in a similar position when it comes to our private information and its (commercial) usage. Even though their business model is built on monetizing the information of their own users, there is next to no regulation setting the limits of reason or punishing the greed of those Goliaths. The Cambridge Analytica affair was the poster-child example of what we are talking about. If you think that GDPR has brought some justice to the topic, look at how pathetic the improvements of some companies are. When it comes to XAI (Explainable AI, depicted here), there is not even a general guideline proposed.

Therefore, the answer to this treacherous mode might be similar to the one in Dieselgate. After the emission scandal fully broke out, honest automotive companies desperately called for independent agencies to measure the actually emitted levels of gases in real day-to-day driving and to run public tests of actual emissions. They realized that their business (and brands) depend on user trust. A car, after all, is The tool to which we entrust our lives and the lives of our families. Thus, a line of thought such as "If they cheated about such a banal thing as emissions, what else, perhaps among the security features, could they have lied about?" is a dangerous rope to balance on. Unfair treatment of ecology (let's face it, most of us are ignorant about it) might easily turn into a customer-mistreating Volkswagen (or other brand). So the prevention of a repeat of the fraud has been entrusted to independent third parties.

The problem of social media is that they are still in the Diesel phase. They have not understood/admitted the value of third-party auditing. Mark Zuckerberg (and other social media executives) talk us down with statements that the best prevention is to hand over the fake-news fight to internally developed detection routines. They do so even in the midst of several blunt failures around conspiracy theories or scandals of mistreating the privacy data of their own users. Social media companies simply don't realize that they risk the scenario that hit the automotive industry. They also neglect the fact that if Facebook (or any social network) users feel they are being lied to or bullied about their own privacy data, users will trigger a massive exodus from that platform. If you count on human laziness prevailing and people swallowing their concerns, you had better know that 1] for the forthcoming generation, Facebook is already not the first social media choice; 2] a similar peace of mind about high (mental) switching costs was held by mobile operators, banks or utilities, and they now have to struggle to keep their user bases.

Social media still have too much of the Diesel in them. If you want to benefit from this, don't go selling electric cars to Zuckerberg; rather, design for them algorithms to audit their work with user data. Or launch a new social media network that has a transparent and accessible audit of user data handling built directly into its core functions from the very beginning. Because social media will also go through their Dieselgate. And probably pretty soon …