The Mighty Data launching NEW AI PODCAST | mAIndset

📢 For years, Dávid Tvrdoň and I searched for an AI podcast that is more than just a summary of news like, “AI model X can now do Y.” We craved something deeper: why it’s happening, what exactly it means for us, and how it shapes our communities. As we did not find anything like that (in the CEE space), we decided: “If no one else does, we shall!” Thus, we gathered our brainpower (and a fair share of funny life anecdotes 🤣) to create the first AI podcast in the CEE region that goes beyond the headlines, one that gives you context and makes you think. AI is already transforming the world around us, but many people barely notice it. So yes, a CEE AI podcast? Here it is, folks. 😎

🤔 Why “mAIndset”? Well, our mission is straightforward: “Shape What You Know and Think About AI.” We don’t just want you to stay informed; we want you to truly understand AI. From ethics to job market impacts to practical tools for daily life, we’ll cover it all. But to stay on that mission, we need you! No radio silence, please! Vote on topics you’d like us to cover: what fascinates you about AI but hasn’t been properly discussed? 🤓 Share your ideas (or upvote existing suggestions) here:
https://lnkd.in/ewUmuq4H.
Future episodes will prioritize the themes that get the most votes.

🤷‍♂️ We made the podcast in English, though it is not a mother tongue for either of us, as a courtesy to the AI-hungry audience beyond our region. Some local-language episodes are already in the oven, and we’re genuinely curious how strongly you’d prefer localized episodes, or a mix, over the English version. 👀 So drop us feedback (not only) on this at ideas@maindsetpodcast.com.

🎙️ And now – drumroll, please 🥁 – our first episode! It’s titled “What You Are Not Ready For In 2025,” or, in other words: “AI topics that might scare you now but will soon be your new reality.” (So you’d better know them ahead of time.) We’re diving into predictions for 2025 that haven’t made the mainstream news but are crucial. No boring “AI agent wrote an email” fluff; we’re talking about how AI will reshape businesses, relationships, jobs, and life itself. Check out the full episode here (video version):
https://shorturl.at/iZCEq

⛑️ To keep this podcast alive, we need your help! Give us a 5-star review (minimum, or Dávid might start sending you AI-generated haikus about guilt 😂), subscribe to the podcast, and most importantly, share it with a friend or colleague interested in AI or tech. So go and hit those “Share” and “Send” buttons, folks. We hope you’ll enjoy the first episode(s). 🚀 LET’S GO!

Will all Coders end up as Policemen, Firefighters or Ambulance crew?!

As an AI expert, I spend a lot of time pondering the future of work in tech. Recently, I’ve been struck by an intriguing parallel between the world of autonomous vehicles and the evolving landscape of software development. Buckle up, fellow code jockeys – we’re about to take a wild ride through the future of our profession!

The Last Stand of Human Drivers

Picture this: it’s 2040, and the streets are filled with sleek, silent autonomous vehicles. Humans lounge in their cars, feet up on the dashboard, while robots do all the driving. Sounds great, right? But wait – what’s that piercing siren in the distance?

That’s right, it’s the unmistakable wail of emergency vehicles – police cars, fire trucks, and ambulances. These special-purpose vehicles are likely to be the last bastions of human control on our roads. Why? Because when lives are on the line, we still need that human touch.

Think about it: a police car chasing down a suspect might need to break traffic laws, take unexpected shortcuts, or make split-second decisions that no AI has yet mastered. A fire truck may need to plow through obstacles or navigate burning debris. And an ambulance? Well, let’s just say that weaving through traffic at breakneck speeds while keeping a patient stable is not for the faint of heart (or the cold of processor).

A Significant Slice of the Workforce Pie

Now, you might be thinking, “Sure, but how many people actually work in these professions?” Well, buckle up for some number crunching!

According to official U.S. statistics, as of 2020:

  • Police and detectives: about 708,000 jobs
  • Firefighters: approximately 1,041,200 jobs
  • EMTs and paramedics: around 302,743 jobs

That’s a total of roughly 2.05 million jobs in these three fields alone. And that’s just in the United States! Globally, these numbers are much, much higher. We’re talking about a significant chunk of the workforce here, folks.
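For those who like to check the number crunching, here is a minimal Python sketch that adds up the three figures quoted above. The dictionary name and its keys are just illustrative labels of mine; the values are the ones from the list.

```python
# Job counts exactly as quoted in the list above (U.S., 2020).
emergency_jobs = {
    "police_and_detectives": 708_000,
    "firefighters": 1_041_200,
    "emts_and_paramedics": 302_743,
}

# Sum the three professions and show the total in plain and "millions" form.
total_jobs = sum(emergency_jobs.values())
print(f"Total: {total_jobs:,} jobs (~{total_jobs / 1e6:.2f} million)")
```

Swap in figures for another country or year and the same few lines give you the comparable total.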

Coders: The Emergency Services of the Digital World?

So, what does this have to do with us code monkeys? Well, gather ’round the water cooler (or should I say, the Stack Overflow forum), and let me spin you a tale of the future.

As AI continues to advance in leaps and bounds, more and more coding tasks will be automated. But just like those emergency vehicles, there will always be a need for human coders to handle the complex, unpredictable, and high-stakes situations that AI just can’t manage.

Let’s break it down:

  1. Code Cops: Just as police officers uphold the law, we’ll need sharp-eyed developers to police our software ecosystems. These digital detectives will hunt down security vulnerabilities, chase after elusive bugs, and keep our cyber streets safe from the ne’er-do-wells of the coding world.
  2. Code Firefighters: When a critical system goes up in flames (metaphorically speaking, of course), who you gonna call? That’s right, the code firefighters! These brave souls will dive into the burning wreckage of crashed servers and melting databases, armed with nothing but their wits, a command line, and possibly a very large cup of coffee.
  3. Code Paramedics: Sometimes, code doesn’t crash – it just gets really, really sick. Enter the code paramedics, ready to perform CPR (Code Pulse Resuscitation) on ailing algorithms and patch up bleeding edge technologies. They’ll be the ones making house calls to tech startups at 3 AM when the latest AI model starts hallucinating cat pictures instead of stock predictions.

Preparing for Your Future in Digital Emergency Services

So, how can you, dear fellow developer, prepare for this brave new world of coding? Here are some tips to future-proof your career:

  1. Embrace complexity: While AI might handle the routine tasks, humans will still be needed for the gnarly, interdependent systems that no machine learning model can fully grasp. Dive into distributed systems, machine learning operations, and other complex domains.
  2. Develop your diagnostic skills: Just as a good EMT can quickly assess a patient’s condition, you’ll need to sharpen your ability to rapidly diagnose and triage software issues. Practice debugging under pressure – maybe set up some coding escape rooms?
  3. Master the art of improvisation: Emergency responders often have to think on their feet. Start exploring creative problem-solving techniques and participate in hackathons to hone your ability to code under pressure.
  4. Cultivate your people skills: In emergency situations, clear communication is crucial. Work on explaining complex technical concepts to non-technical people. You never know when you’ll need to talk a panicked CEO through a system reboot.
  5. Stay fit (mentally and physically): Emergency help does not come in pre-scheduled time slots. Thus, unusual coding hours or places might become the norm. Start building your endurance now – both in terms of sustained problem-solving and the ability to subsist on nothing but pizza and energy drinks for days on end.

In the end, while coding as we know it may change, there’s always going to be a need for people who can step in when things go wrong. Whether you end up as the policeman of software, the fireman of coding, or the ambulance crew for dying code, rest assured: your problem-solving skills will always be in demand.

So, coders, get ready to suit up. The future may be full of AI drivers, but when the digital world hits a pothole, someone’s going to have to show up with the sirens blaring—and that someone might just be you.

Now, if you’ll excuse me, I hear the faint sound of a server crashing in the distance. This needs a code cop to go investigate. Over and out!

Data Executive’s Read 2023 | Book suggestions

Staying sharp in the data realm is like juggling flaming laptops – challenging and a tad risky. To keep my executive skills from going the way of the floppy disk, I’ve committed to tackling a whopping 10,000 pages of books annually. It’s like a private brain gym, but with more words and fewer sweaty towels. For executives (and not only for them), reading a 300+ page book is a large time investment, so you had better pick a worthy one. Therefore, below I offer a list of my best reads of 2023, curated to inspire, educate, and maybe even give you a chuckle. Think of the books listed below as a potential beacon in the maze of staying tuned to data wizardry!

 

Blue Ocean SHIFT

Topic | Innovation, Strategy

If you have ever gone through some strategic management training, this name might ring a bell. You might also roll your eyes, as Renée Mauborgne and W. Chan Kim published their first introduction to Blue Ocean in 2004, a whopping 20 years ago. But wait, I am not that ignorant; there is more to this suggestion.

Blue Ocean Strategy (BOS) is one of the major strategy concepts for differentiating your business from (bloodthirsty, neck-breaking) competition. It is a framework that enables you to innovate no matter how good, bad or unique your products or services are. If you have not read this book before, close the gap immediately. I have used it in several assignments during my career, and the methodology always yielded interesting new business strategies.

However, even if you did read the original 2004 Blue Ocean Strategy book, this one is different. The authors of the original concept bring additional insights on not only how to design the differentiating strategy but, above all, how to implement it. They extended and rewrote the original scope of BOS based on what they learned from 20 years of implementing it in industries and public organizations. Hence the reference to “SHIFT” in the updated title. I honestly think this is a must-read for any middle or top manager.

Link | https://www.amazon.com/Blue-Ocean-Shift-Competing-Confidence/dp/0316314048

 

AI 2041

Topic | Sci-Fi Fiction + AI commentary  

Many authors and books try to explain the major recent shift in Artificial Intelligence (or AI). A few writers also dare to predict or speculate about where it might take us from here.

However, the book from Kai-Fu Lee and Chen Qiufan is very special and different. Kai-Fu is a former executive at Google, Apple and the like, where he was responsible for implementing AI solutions. When he talks about AI methods, he most likely headed the implementation of their early pilots. A real well of AI knowledge and experience.

He teamed up with a sci-fi author to write a unique piece narrated through ten stories (all happening around the year 2041). In each story/chapter they first introduce a future use of AI in real life, only to finish the chapter with facts and details on how it would be implemented and what realistic stage of AI to expect before 2041.

The book is somewhat thick, but absolutely worth it and easy to read, as you can dig through it one story at a time. I think it is an especially good gift for somebody who wants to understand the (future of) AI but does not have the technical background to read white papers.

Link | https://www.amazon.com/AI-2041-Ten-Visions-Future/dp/059323829X

Becoming a Data Head

Topic | Data-driven, Management, Data literacy

I put this book on my reading list out of curiosity. The reviews suggested that it is a good entry-level book for executives trying to become data-driven or AI-ready. Being an SVP Data & Analytics (and a seasoned Data Scientist) myself, it is hardly a fit for my career phase. But I have seen so many books claim (and fail) to introduce readers to the Data Science thicket that I was tempted to see how this one would do. Yet another promise falling flat?

No, quite the contrary! This book really walks its talk. Namely, it walks you smoothly through the different stages of Data Analytics and Data Science. Even the basic concepts are explained in a no-nonsense style that requires no previous knowledge, but it neither insults your intelligence nor bores you if you are reading things already obvious to you. You can also decide how “far into the woods” you want to go and stop reading whenever you reach the level of understanding that is enough for you. Or you can look even deeper to understand the principles behind what you just read.

I strongly recommend this book to anybody trying to change career into data jobs. I also find it a great present for any manager or executive you want to enlighten about data.

Link | https://www.amazon.com/Becoming-Data-Head-Understand-Statistics/dp/1119741742/

COLORWISE

Topic | Data Visualization, Storytelling

As somebody shaping (literally) thousands of visualizations year after year, I welcome books describing the rules and good (and bad) practices for creating them. I have a few in my library (and suggested them in my previous reading lists), but they mostly talk about what kind of graph to choose and how to shape the composition. Many of them take the use of color for granted (or touch on the issue only in passing).

ColorWise is a book that puts “color choice” and “color coding” in graphs and visualizations in the full spotlight. It explains the background of colors in a very non-academic way and will surely take you beyond your previous knowledge of color usage. It also gives clear guidance on how to create your graph color schemes if you are anchored to some (must-have) brand colors. What is more, it goes deeper into the psychology of different color schemes and warns you about cultural and color-deficiency pitfalls in your graphs. If you are already a pro, you will often nod your head with “Exactly!” on your lips … and you will still learn a few new aspects to think about. If you are a “regular” color user, your color coding skills will get a significant boost. I strongly recommend it to anybody who needs to produce dashboards or presentations regularly in their work.

Link | https://www.amazon.com/ColorWise-Storytellers-Guide-Intentional-Color/dp/1492097845

BUILD

Topic | Strategy, Data, Product management

Many admire Tony Fadell for what he achieved. He built the iPod for Apple and basically saved Apple from falling. And then, humbly, he built the iPhone on top. And if that is not enough for you, he then built a brand new company, Nest, which started the whole smart-home category of technology, and sold it to Google for a few billion. So certainly an inspiring enough person. But if you are not a tech geek, you probably had not heard his name before or cared too much. Neither had I. And I regret it.

His book BUILD is an interesting mixture of advice and guidance for people who want to have their life (and career) a bit more in their own hands. He narrates the story from adolescence through the early years on the job up to the CEO part of your life. And yes, maybe you will never (want to) be a CEO, but the story is still good guidance. It might sound fluffy, but whoever you are in business, I am quite sure you can benefit from some chapter of this book. Yes, occasionally you have to pardon Tony’s American optics, but it comes across more like a fragrance you know but would not wear yourself than something repulsive.

I especially admire the chapter on how data plays a different role in the individual phases of building a product. It gives you clear guidance on where data is the horse and where it is the (still needed, but rather) cart. Having gone through 3 layers of management (Team Lead to SVP) myself, I can confirm that his views on how to perceive your role are very accurate, and I was amazed at how he compresses the essence into (often just) a few pages of text.

All in all, this book is a masterpiece (uh, I told you that already, right?). And I strongly suggest you read it. The earlier the better. Some of the lessons he gives I had to learn the hard way, and I only wish he had written the book earlier. Have a great read!

Link | https://www.amazon.com/Build-Unorthodox-Guide-Making-Things/dp/B09CF2YB6Z/

All in on AI

Topic | AI, Growth, Strategy

I have read most of the 15 books that Tom Davenport has authored and was mostly happy with them. Therefore, when I saw his newest piece, ALL IN ON AI, I was full of anticipation.

The author introduces a group of businesses that decided to make artificial intelligence the centerpiece of their business strategy and operations. They really went ALL-IN on it. The book first walks you through what such an AI-ALL-IN company looks like: what the common denominators are, but also the industry-specific aspects. You quickly understand how to spot the markers.

But that’s only the start. In the remainder of the book, Davenport (and his co-author) provide examples of how to move your existing business into the AI-ALL-IN state. They do it cleverly, picking real companies’ stories from different maturity levels and industries. The authors also methodically link the needed AI markers to the developments in the stories, showing that the common denominators actually fit and are well chosen.

Who is this book for?
Well, for anybody who envisions or dreams about benefiting from progressive technologies in their work. For those wanting to step up or future-proof their business.
It’s also a good gift idea for employees trying to pitch the AI change to top manager(s).

Link | https://www.amazon.com/All-AI-Companies-Artificial-Intelligence/dp/1647824699/

Good Data

Topic |  Data, Ethics, Search data

Reading Sam Gilbert’s book Good Data is stimulating and entertaining at the same time (you just need to see through the author’s masked humor). Sam is a seasoned data professional who does not fall for the clichés and mental shortcuts of today’s data speak.

I did not always agree with his opinions, but all the questions he raised in the book made me really (re)think what I considered the role of data to be in different corners of business and our society. Thus, if you ask “What questions should we have about the future of data?”, this book will get you there.

As for the answers to those questions, please think a bit more critically than the author suggests. All in all, it is quick and fun to read, and it opens new horizons. Worth a few days of reading.

Link | https://www.amazon.com/Good-Data-Optimists-Digital-Future/dp/1787396339

 

Don’t Make Me Think (Revisited)

Topic | UX, Product management, Web design

Websites and apps have become our window to everyday activities, social interaction, shopping and most of our work (certainly so during COVID). In the 1990s and 2000s, institutions and businesses tried to impress us with physical real estate. But how do digital institutions treat us now?

This book is for everyone who wants to grasp the basics (yes, it starts from the ground up) of how to design a digital interface for the web or an app. Even though this might sound like a UX designer’s guideline (which I would happily use if it were one), it is served in down-to-earth language and does not require any design domain knowledge from you (but it leaves you with some after you read it through).

It is not a long read, and I strongly encourage anybody who interacts with websites and apps (or has a say in their design) to at least skim through it. A no-regret move!

Link | https://www.amazon.com/Dont-Make-Think-Revisited-Usability/dp/0321965515

Extremely ONLINE

Topic | Creators, Social Media

At first glance, the subject of online influencers might not seem like a page-turner. However, a friend’s recommendation led me to Taylor Lorenz’s exploration of the hidden layers behind social media’s evolution, and I was instantly captivated.

This book isn’t just a timeline of social media from the late 90s; it’s a narrative that weaves through the changing social dynamics influenced by online platforms. It provides an intriguing mix of statistical data and storytelling, revealing how various online communities engage with social media.

The book also offers surprising insights into questions like:

  • What was the first major topic that sparked the blogging revolution?
  • How did the requirement for influencers to disclose sponsorships impact the effectiveness of advertisements?
  • What truly contributes to societal polarization if not social media algorithms?
  • Which other social networks suffered at the hands of Twitter?

For those in marketing or content creation, this book is an essential read from start to finish. It’s equally crucial for parents or soon-to-be parents to understand the evolving relationship between kids and social media.

For me, the book has a somewhat special twist that is likely to work for you as well if you are in your late 30s or 40s. It maps the development of internet consumption for our generation, as the time when blogs hit the internet was exactly when our generation started to interact with it.

Link | https://www.amazon.com/Extremely-Online-Untold-Influence-Internet/dp/1982146869

Machine Learning Design Patterns

Topic | Machine Learning, Data Science

This book feels like the Swiss Army knife for machine learning enthusiasts. It’s the first of its kind as it dives into the wild world of ML design patterns. Forget about dry, technical jargon; this book is like a treasure map, guiding you through 30 quirky, yet ingenious design patterns, each one a secret weapon against those head-scratching ML problems. It’s like finding cheat codes for a video game, but for machine learning!

Imagine a cookbook, but instead of recipes for apple pie, it’s chock-full of solutions for when your AI project decides to go on a coffee break. Whether you’re a seasoned data scientist or just someone who accidentally wandered into the machine learning aisle, this book is your trusty sidekick. It’s the kind of read that makes you think, “Ah, so this is what Google’s brainiacs do for fun!” – solving problems and making ML as approachable as a friendly robot assistant.

Link | https://www.amazon.com/Machine-Learning-Design-Patterns-Preparation/dp/1098115783

CRUX

Topic | Strategy, Business Analysis

As someone with a background in Strategic Management, I’ve devoured nearly every strategy book available. Through my extensive reading, I’ve discovered two authors who consistently deliver valuable strategic insights: Gary Hamel and Richard Rumelt.

Therefore, to no surprise, Richard Rumelt’s CRUX stands out as a masterpiece (again). It skillfully guides you in crafting authentic strategies for your business or team and shatters common executive misconceptions, like the necessity of a mission statement, misconstruing international expansion as strategy, or overvaluing shareholder interests. It’s also an excellent resource for learning to spearhead genuine strategic development.

I strongly recommend this book to all executives. Be prepared for a reflective and sometimes uncomfortable journey through your previous strategy endeavors. It’s equally insightful for middle managers, equipping them with the knowledge to challenge and refine the strategies proposed by their higher-ups. Overall, it’s a perfect read to gift yourself or others during a vacation.

Link | https://www.amazon.com/Crux-Richard-Rumelt/dp/1788169514

The Choice Factory

Topic | Marketing, Psychology, Feature engineering

“The Choice Factory” by Richard Shotton is an exceptional read, especially recommended for data analysts focused on human behavior modeling and prediction, as well as marketers seeking to boost their conversions by leveraging (or riding the tailwind of) natural human tendencies.

What sets this book apart is its reliance on proven real-world best practices, presented not as isolated case studies, but as principles backed by comprehensive research. Another key strength of the book lies in its concise, easily digestible chapters, each ending with practical, actionable advice on how to implement these insights.

I strongly endorse this book for anyone looking to gain a deeper understanding of human behavior, whether in feature engineering for ML prediction models or in a marketing optimization context.

Link | https://www.amazon.com/Choice-Factory-behavioural-biases-influence/dp/085719609X

The Ruthless Elimination of Hurry

Topic | Work-Life balance, Mental health

“The Ruthless Elimination of Hurry,” as the title aptly indicates, is more than just a book; it’s a compelling manifesto advocating for a deliberate shift away from the relentless pursuit of speed for its own sake.

In our fast-paced world, where speed is often synonymous with efficiency and success, this book presents a refreshing perspective. It acknowledges that while speed can be beneficial (except when it leads to a speeding ticket!), it shouldn’t be the primary objective. Speed should be a tool, employed judiciously and only when truly necessary. The book emphasizes the importance of intentionality in our actions, encouraging us not to rush mindlessly but to consider the purpose and value of our speed.

Authored by John M. Comer, a U.S. pastor, the book is understandably infused with religious references and teachings, particularly focusing on Jesus and other Christian elements. For some readers, this religious aspect might seem predominant, but the book’s core message transcends religious boundaries. If one can look past the religious overtones, or perhaps even draw insight from them, “The Ruthless Elimination of Hurry” reveals itself as a deeply thought-provoking and intriguing read.

It’s a book that challenges the status quo of our hurried lives. It invites readers to pause, reflect, and reconsider the pace at which we live. The author’s insights offer a unique perspective on how slowing down can lead to a more fulfilled, purpose-driven life. This makes the book an essential read for anyone feeling overwhelmed by the ceaseless rush of modern life and seeking a path to a more balanced, intentional existence.

Link | https://www.amazon.com/Ruthless-Elimination-Hurry-Emotionally-Spiritually/dp/0525653090

Data Science on AWS

Topic | ML operations, Data Science, Data engineering

Ah, the wild ride of prototyping machine learning models! Many of us have gone through fast prototyping (or toy examples) of the Machine learning clustering or prediction models in notebooks or sand-box environments. It’s like building a Lego castle in your living room – fun, easy, and oh-so-satisfying. But then, you decide to move that castle to the real world, and suddenly, it’s like trying to assemble it in a windstorm. Surprise! Porting your perfect little prototype into the jungle of a live environment is like herding cats while juggling.

Most of today’s implementations have no choice but to run in a cloud, virtual-machine set-up, which requires additional complexity and care to deliver even the bare functionality of the easy, local-machine PoC. This book is about how to think through the Machine Learning aspects of a live solution in advance, and to understand what combination of tools you should expect to deploy to keep your machine learning train properly on the rails. It is a must-read not because you will ever be coding the things and connectors mentioned in the material. It is essential rather because you need to understand everything your teams have to go through to make it all happen for you.

Link | https://www.amazon.com/Data-Science-AWS-End-End/dp/1492079391

Text As Data

Topic | NLP, Machine Learning

As the title of the book rightly suggests, text has long been perceived as a special “animal”: on the edge of data analytics, much more obscure than the analysis of relational data with SQL or with predictive analytics. Text analytics was also handled by dedicated (Python) packages and often by staff specializing only in NLP. If you were not one of them, you would probably just reach for (simplified) predefined functions in NLTK (or a similar code library).
Those times are over. Text is mainstream. If you were not convinced before ChatGPT burst onto the scene, now there is no way to deny it. But text analytics still finds much of its audience (and practitioners) stuck in the pre-text era, with only a rough idea of how to address data stored in troves of text.

Therefore, this book comes as a kind of gift. If you admit to having only a general (read: limited) understanding of extracting insight from text and of how to set up text analytics in your team, or if you have not been giving text the same weight as ML or Reinforcement Learning, this book helps you close that gap. It is well written and always illustrated with telling examples. If you failed to buy a ticket for the departing text analytics “train”, this is your fast track to get on board.

Link | https://www.amazon.com/Text-Data-Framework-Learning-Sciences/dp/0691207550

The Coming Wave

Topic | AI, Philosophy

Hold onto your hats, folks! Mustafa Suleyman’s “The Coming Wave” isn’t just a book; it’s like a roller coaster ride into the future, where your coffee maker might be plotting world domination. Suleyman, the AI whiz-kid and DeepMind co-founder, is dishing out a buffet of mind-boggling predictions. Imagine a world where your vacuum cleaner is judging your music taste and your fridge is gossiping about your late-night snack habits. That’s the kind of AI party Suleyman’s inviting us to.

But wait, there’s a catch. It’s not all about tech wizardry and gadgets having a mind of their own. Suleyman waves a big, bright warning flag about AI’s dark side. Picture a world where AI is like that one overachieving cousin who’s great at everything but sometimes scares the living daylights out of you. He’s like the cool uncle of the tech world, telling us to enjoy the party but maybe hide the fine china just in case.

So, whether you’re a tech-head, a skeptic, or just someone who’s curious if your phone is silently laughing at your TikTok attempts, “The Coming Wave” is your handbook for the AI age. It’s like a survival guide for the digital jungle, complete with a map, a flashlight, and a slightly ominous warning about the creatures lurking in the shadows. Buckle up and get ready for a wild ride into the future, where your toaster might just be the smartest thing in your house!

Link | https://www.amazon.com/The-Coming-Wave/dp/1847927491

Julia High Performance

Topic | Data engineering, Data Science

No, this is not a mash-up of Shakespeare’s famous love story and a performance marketing guide. Julia might still be the new kid on the block in the programming world, especially compared to Python, the reigning “lingua franca” of data science. But don’t be fooled – this emerging language packs a punch with its speed and efficiency. “Julia High Performance” by Avik Sengupta and Alan Edelman is like the ultimate guidebook for this speedster of a language.

Think of this book as your go-to manual for making your code run like a sprinter on a caffeine high. It’s like a masterclass in getting the most out of Julia, from understanding its high-speed capabilities to avoiding performance roadblocks. While some readers might wish for a deeper dive into the more intricate examples, the book remains an eye-opener, proving its worth by empowering users to supercharge their projects, leaving Python in the dust. Some users even boasted a tenfold performance boost after switching from Python/NumPy to Julia – think about leaving your comfort zone and heading towards a coding glow-up!

This book, admittedly, is a bit of a joker card, but if you did not pick anything above and you are reasonably fluent in Python coding, maybe give it a try.

Link | https://www.amazon.com/Julia-High-Performance-Avik-Sengupta/dp/178829811X

.

DATA JOBS MARKET GERMANY | 2023-06 update

Data job market continuously shrinking | Even Data Engineering in decline, though the least of all Data jobs | Stuttgart overtaking Cologne in most of the Data job categories | Pricing analyst (as separate category) almost evaporated | Smaller German cities still in hunt for Data analysts

 

Every month I try to bring an update on the German labor market in the area of Data professions. Feel free to use this overview for your own orientation or for scanning market opportunities (whichever side of the job interview table you plan to sit on 😊). This report is by no means intended to replace official job market stats, so please note that it comments on the development in monthly batches and there may be other sources that describe the job market dynamics in more granular form.

 

Data Engineers – drop into decline as well

Though Data Engineering was the “last fort standing” in the German data jobs market, in June 2023 it fell into a 1.1% decline as well. This is the net result of different dynamics in Berlin (where demand dropped by 19% MoM) and Munich, which is still hungry for new Data Engineers (the number of open roles went up by 12% MoM). An interesting change is also happening in western and south-western Germany, where demand in Cologne dropped so drastically (-20%) that it fell behind Stuttgart (growing by +7%) for the first time in measurement history.

When it comes to demand for different seniority levels, the vast majority of open positions remain without any seniority indication (and go with just a generic Data engineer). Among those explicitly looking for Senior data engineers, the demand increased by ~170 positions, taking the share of open Senior positions to 17.6% of all vacancies. On the other end of the spectrum, there is about the same number of open Junior Data engineer job ads as a month ago, accounting for 6.2% of all open data engineering vacancies.
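For readers who want to re-check the month-over-month (MoM) figures themselves, here is a minimal sketch. It uses the open-position counts from the tables of this report and of the 2023-05 edition; taking the Germany-wide total as the sum of the three seniority buckets is my assumption, and `mom_change` is just an illustrative helper name.

```python
def mom_change(previous, current):
    """Month-over-month change in percent, rounded to one decimal."""
    return round(100 * (current - previous) / previous, 1)

# Open Data Engineering positions as (May 2023, June 2023),
# copied from the tables of the two monthly reports.
berlin = (2203, 1792)
cologne = (470, 376)
# Assumption: the Germany-wide total is the sum of the seniority buckets.
germany = (637 + 8768 + 1778, 686 + 8431 + 1947)

berlin_mom = mom_change(*berlin)
cologne_mom = mom_change(*cologne)
germany_mom = mom_change(*germany)
print(berlin_mom, cologne_mom, germany_mom)
```

The results land close to the roughly -19% (Berlin), -20% (Cologne) and -1.1% (Germany) quoted in the text.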

Demand by #worklocation:

#BERLIN                           16.2%   [1 792 open positions]

#MUNICH                        12.4%   [1 372]

#HAMBURG                    6.3%     [   697]

#FRANKFURT                   5.6%     [   620]

#COLOGNE                      3.4%     [   376]

#STUTTGART                   4.2%     [   465]

 

Data engineering jobs by the #SENIORITY:

#Junior                                            6.2%    [   686]

#Midlevel (or unspecified)          76.2%   [8 431]

#Senior                                          17.6%   [1 947]
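The percentages in the tables above are simple shares of the total number of open positions. A small sketch of that arithmetic, assuming (my assumption) that the three seniority buckets together cover all open Data Engineering ads; the helper name `shares` is illustrative only:

```python
def shares(counts):
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

# June 2023 Data Engineering ads by seniority, from the table above.
june_seniority = {
    "Junior": 686,
    "Midlevel (or unspecified)": 8431,
    "Senior": 1947,
}
seniority_shares = shares(june_seniority)

# The same total also serves as the base for the city shares.
total_open = sum(june_seniority.values())
berlin_share = round(100 * 1792 / total_open, 1)
print(seniority_shares, berlin_share)
```

Running it reproduces the 6.2% / 76.2% / 17.6% split and Berlin's 16.2% share.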

 

 

Data Scientists – Falling through a hole

Jobs in the area of Data Science had already been slowly declining before (by -6.5%), and the dynamics now add up to a pretty bleak overall picture. Though there are still 40K data scientists wanted in Germany, in the last 3 months the demand dropped by a whopping 19%. The cut is most visible in the Junior spectrum, where there is a -37% drop in demand MoM. Generic positions were also in decline (they dropped by -9%), but companies’ demand still grows in explicit Senior roles (+13%). Not sure if this is already attributable to the “GPT-effect”, but being a Senior DS certainly puts you on the more promising side of the job market “river”. At least for now.

Geographically, some interesting moves are happening as well. While the Top 3 German Data Science hubs (Berlin, Munich and Hamburg) are already on the brakes (~ -20% MoM), the south-west (Frankfurt, Stuttgart and Cologne) still did not get what they were looking for (stable demand with even +1% growth). Out of these secondary hubs, the market has almost frozen in Cologne, to the point that Stuttgart overtook Cologne in Data Science positions as well. Other 2nd tier cities like Frankfurt and the already mentioned Stuttgart keep their Data Science appetite high, so hopes of getting an interesting job offer there are still alive.

Demand by #worklocation:

#BERLIN                           14.2%   [5 716 open positions]

#MUNICH                        11.5%   [4 629]

#HAMBURG                      5.7%    [2 295]

#FRANKFURT                     6.7%    [2 697]

#COLOGNE                        2.0%    [   805]

#STUTTGART                     3.3%    [1 328]

 

 

Data science jobs by the #SENIORITY:

#Junior                                          6.9%     [  2 778]

#Midlevel (or unspecified)          67.0%   [26 972]

#Senior                                          26.1%   [10 057]

 

Data Analysts – Only midlevel BI keeping somewhat afloat

The market for Data analysts is also on a falling streak lasting several months. In June the drop is -6.8%. The only sub-group of analytical jobs that holds the line of demand are Business Intelligence analysts, who recorded +2%; all other analytical roles have fewer open positions. The trends are not positive at the edges of the seniority spectrum; only the mid-tier was able to keep itself afloat. Within the last month the market has lost its appetite for both very Senior and Junior positions. Interestingly enough, the pricing analyst market is almost non-existent. In the whole of Germany, there are fewer than 30 open positions for Pricing analysts in total.

Geographically, the development has its own twists as well. Most big German cities (Munich, Hamburg, Cologne, Dusseldorf) are deep in the declining trend of Data analyst positions. Contrary to the development in Data Science and Data Engineering, where Stuttgart is booming, in Data Analytics it records the hardest percentage drop (-33%). By contrast, the two German hubs where demand is still on the rise are Berlin and Frankfurt, where there were more open positions MoM, despite the general decline at the federal level. So where does the drop really happen? Well, it is smaller cities and rural areas that dropped the ball in the last month. You can see that also from the fact that while in May the top 7 cities together held 45% of all open data analyst offers, in June it is up to nearly 49%, signaling the growing absence of smaller cities in the jobs mix.

 

Demand by #role:

#BI                                   25.4%   [10 770 open positions]

#CONSULTANT                17.4%   [7 372]

#MARKETING                    2.2%    [   916]

#SALES                              0.5%    [   226]

#PRODUCT MNG.             0.6%    [   254]

#PRICING                           0.1%    [     21]

 

Demand by #worklocation:

#BERLIN                           11.7%   [4 951 open positions]

#MUNICH                        9.8%     [4 137]

#HAMBURG                    8.2%     [3 494]

#FRANKFURT                   6.9%     [2 936]

#COLOGNE                        4.7%    [2 005]

#DUESSELDORF                 4.0%    [2 697]

#STUTTGART                     3.2%    [1 338]

 

Data analyst jobs by the #SENIORITY:

#Intern                                          0.4%     [     196]

#Junior                                          9.1%     [  4 144]

#Midlevel (or unspecified)          80.9%   [36 831]

#Senior                                          9.6%     [  4 367]

 

In general, after a somewhat cloudy spring, the market of open positions in data jobs is in full decline in all three important verticals (Data Engineering, Data Science and Data Analytics). If you live in a big city it might not feel like that, because there are usually 500+ positions to choose from (which sounds like a plethora of choice without the need to relocate). But one should realize that fewer and fewer open positions signal that companies are not on hiring sprees. That also means that budgets will be tighter and salary ceilings not as high as before. From my own experience as a hiring manager for data roles at www.flaconi.de I can also add that international candidates (mainly from outside the EU) are still eager to take their chance to shine. Thus, the competition is getting tighter as well. When you plan your next career move on the German data jobs market, do a bit of research before “jumping into the water”. Good luck and see you in the next edition of this regular report.

DATA JOBS MARKET in GERMANY | 2023-05 overview

Data job market slowly shrinking | Most stable in Data Engineering, but leaning rather towards mid-spectrum | Munich still desperate for Data Scientists, in Hamburg and Cologne the Data Science demand dropped | Data Consulting jobs evaporated | Smaller German cities still in hunt for Data analysts

 

Every month I try to bring an update on the German labor market in the area of Data professions. Feel free to use this overview for your own orientation or for scanning market opportunities (whichever side of the job interview table you plan to sit on 😊). This report is by no means intended to replace official job market stats, so please note that it comments on the development in monthly batches and there may be other sources that describe the job market dynamics in more granular form.

 

Data Engineers – close to stagnating

The gradual cool-down of data jobs shows itself in the Data Engineering space as well, but the drops in demand for this profession are the mildest, and Data Engineering is less than 1% below stagnation. Interestingly, Berlin and Munich are still hungry for new Data Engineers (they have a higher number of open positions than last month), but secondary hubs (like Hamburg or Frankfurt) have already filled many positions (or withdrawn their hiring intentions).

When it comes to demand for different seniority levels, the vast majority of open positions do not indicate any seniority requirement (and go with just a generic Data engineer). Among those explicitly looking for Senior data engineers, the demand has dropped by ~300 positions, taking the share of open Senior positions to 15.9% of all vacancies. On the other end of the spectrum, there are 200 fewer open Junior Data engineer job ads, accounting for 5.7% of all open data engineering positions.

Demand by #worklocation:

#BERLIN                           19.7%   [2 203 open positions]

#MUNICH                        10.9%   [1 219]

#HAMBURG                    7.7%     [   861]

#FRANKFURT                   6.0%     [   671]

#COLOGNE                      4.2%     [   470]

 

Data engineering job by the #SENIORITY:

#Junior                                            5.7%   [   637]

#Midlevel (or unspecified)          78.4%   [8 768]

#Senior                                          15.9%   [1 778]

 

Data Scientists – Some cities getting into desperate mode

Jobs in the area of Data Science are also slowly declining (by -6.5%) in number of open positions, though the base is still well above 40 000 vacancies. Generic positions were less prominent (their share dropped below 70%); companies’ demand rather grows in explicit Senior or Junior roles. That usually signals that companies with clearer projects in mind have been spearheading the development in recent weeks.

Geographically, an interesting play is unfolding. While Berlin (and Hamburg) slowly, step by step, saturate their Data Science needs, Frankfurt and Munich can’t get enough of what they want. The situation seems to be getting desperate mainly in Munich, which is the only larger German city where the demand for Data Scientists is still growing significantly (+21% vs. the overall -7% drop in Germany). If the situation persists, this might overheat the local market, leading to compensation bands spiking steeply. By contrast, the market has almost frozen in Cologne, where within 1 month there are 1 300 fewer open Data Science positions. With such a strong tempo of decline, this can’t possibly be just positions being filled so fast; it rather signals a lot of companies withdrawing their original requisitions.

Demand by #worklocation:

#BERLIN                           17.2%   [7 438 open positions]

#MUNICH                        13.0%   [5 622]

#HAMBURG                      6.5%    [2 811]

#FRANKFURT                  6.2%    [2 681]

#COLOGNE                        1.9%    [   822]

 

Data science jobs by the #SENIORITY:

#Junior                                          10.3%   [  4 454]

#Midlevel (or unspecified)          68.3%   [29 538]

#Senior                                          21.4%   [  9 255]

 

Data Analysts – Consulting jobs evaporated month on month

The market for Data analysts is getting saturated the fastest, with a drop in demand of -7.3%. The main driver is a sudden drop in demand for consultants in data analytical roles, where almost 40% of last month’s consultant roles are not advertised any more. The outlook of consulting companies (and units) is pretty distressed and, hence, hiring for these roles understandably stepped “on the brakes”. More positive trends in specific analytical roles are in Marketing and Pricing, where the number of open positions stagnates (or even slowly grows).

All major German cities (Berlin, Hamburg, Munich, Frankfurt, Cologne) seem to be jumping on the declining trend in Data analyst positions. The development was fastest in Frankfurt, where demand dropped by almost 38%. We can see a very different picture in the lower tiers of analytical hubs (like Stuttgart and Dusseldorf), where a similar number of open data analyst positions persists. So if you are willing to move to (or work remotely for) a smaller city, your chances of being a premium (and wanted) candidate are significantly better there.

An interesting trend is also that data analytical positions are dropping so fast that, if they sustain this trajectory, next month (in June 2023) there might be more Data Science positions open than Data analyst ones. This would also confirm that with (generative) AI booming, companies rather seek talent from the more sophisticated tiers of data skills. We will closely watch the development and debate it in more detail in the next edition of the job market scan.

Demand by #role:

#BI                                         23.1%   [10 497 open positions]

#CONSULTANT                16.9%   [7 673]

#MARKETING                    2.0%    [   920]

#SALES                                  0.5%    [   214]

#PRODUCT MNG.             0.4%    [   173]

#PRICING                              0.1%    [     48]

 

Demand by #worklocation:

#BERLIN                           10.4%   [4 714 open positions]

#MUNICH                        9.6%     [4 368]

#HAMBURG                    7.7%     [3 515]

#FRANKFURT                   4.5%     [2 045]

#COLOGNE                        4.5%    [2 039]

#DUESSELDORF                 4.6%    [2 088]

#STUTTGART                     4.4%    [1 983]

 

Data analyst jobs by the #SENIORITY:

#Intern                                          0.4%     [     196]

#Junior                                          9.1%     [  4 144]

#Midlevel (or unspecified)          80.9%   [36 831]

#Senior                                          9.6%     [  4 367]

How ChatGPT really works in SIMPLE WORDS (and pictures)

Many of us have probably already played with the new kid on the block of the Artificial Intelligence space, ChatGPT from OpenAI. Typing in any question as a prompt and getting a solid, no-gibberish answer, very often even factually precise, is a fascinating experience. But after a few moments of awe at getting an answer to your “question of all questions”, you may have wondered how ChatGPT actually works.

If you are a top-notch Data Scientist, you could probably go into the documentation (and related white papers) and simulate (or even write your own) transformer to see what is going on under the hood. However, besides those privileged few, the usual person is probably deprived of this, ehm, joy. 😊 Therefore, let me walk you through the mechanics of ChatGPT in a robust, but still human-speak explanation in the next few paragraphs (and schemas). Disclaimer: I compiled this overview based on publicly available documentation for the 3.0 version of GPT. The newer versions (like 4.0) work on the same principles but have different sizes of neural nets, look-up dictionaries and context vectors, so if you are super-interested in how the most recent version works, please extend your research beyond this article.

 

6 main steps

Even though our interaction with ChatGPT looks seamless, for every query there are 6 steps going on (in real time). Media label ChatGPT with the single phrase “artificial intelligence”, but it is worth mentioning that of these 6 steps, only two and a half are actually real artificial intelligence components. A significant part of a ChatGPT run is actually relatively simple math of manipulating vectors and matrices. And that makes the details of ChatGPT even more fascinating, even for the lay audience.

 

It starts with compressing the world into 2048 numbers

The first step of ChatGPT’s work is that it reads through the whole query you provided and scans for what you are actually asking. It analyzes the words used and their mutual relationships and encodes the context (not yet the query itself, just the topic) of the question. You might be amazed by the fact that ChatGPT converts the whole world and all the possible questions you may ask into a combination of 2048 topics (represented by decimal numbers). As a heavy simplification, you can say that ChatGPT compresses the Internet world into a 2048-dimensional cube.

 

Context first, then come tokens

As outlined in the previous paragraph, in the process of answering our prompt ChatGPT first takes some time (milliseconds) to understand the context of the query before actually parsing the query itself. So after it decides which area(s) of “reality” you are interested in, it meticulously inspects your entire question. And it literally does so piece by piece, as it decomposes the given question into tokens. In English, a token is usually a stemmed word (base), ignoring the stop-words and other parts of the text that bear no meaning. In other languages tokens can be obtained differently, but as a rule of thumb: number of tokens <= number of words in the question.

For every token the GPT engine makes a look-up in a predefined dictionary of roughly 50K words. Using hash tables (to make the search super fast), it retrieves a vector (again 2048 elements long) for each token. This way each word of the query is linked to the topic dimensions. As the system does not know in advance how many words your request will have, there needs to be a mechanism to accommodate any (allowed) length of the query. To stay flexible, ChatGPT forms an extremely long vector (2048 * number of tokens), in which the sub-vectors coming from the dictionary lookup for each token are arranged one after another in sequence. Therefore a 100-word query might have up to 204 800 vector elements. An even larger, 500-word request might have more than 1 million numbers. This vector is then processed, but first we need to make one more important change.
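To make the look-up-and-concatenate idea tangible, here is a toy sketch in Python. It is not OpenAI's code: the four-word dictionary, the random vectors and the helper names `build_toy_dictionary` and `encode` are all made up for illustration, and the vector length of 2048 is simply the figure used in this article. A Python `dict` is itself a hash table, so the look-up works in the spirit described above.

```python
import random

EMBED_DIM = 2048  # vector length per token, the figure used in this article

def build_toy_dictionary(words, dim=EMBED_DIM, seed=0):
    """Hypothetical stand-in for the ~50K-entry dictionary: word -> vector."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in words}

def encode(tokens, dictionary):
    """Look up every token and lay the vectors one after another."""
    flat = []
    for token in tokens:
        flat.extend(dictionary[token])  # dict look-up = hash table look-up
    return flat

tokens = ["how", "does", "chatgpt", "work"]
toy_dictionary = build_toy_dictionary(tokens)
query_vector = encode(tokens, toy_dictionary)

print(len(query_vector))   # 4 tokens * 2048 numbers each
print(100 * EMBED_DIM)     # size for a 100-token query
print(500 * EMBED_DIM)     # size for a 500-token query
```

The last two lines redo the arithmetic from the paragraph above: 100 tokens give 204 800 numbers and 500 tokens give a bit over a million.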

Where to look (or How to swim in this ocean of data)

As we learned, a 500-word request to ChatGPT might end up as more than 1 million numbers encoding it. That is a real ocean of data. If you as a human received such a long prompt to answer, I guess you would struggle even with where to focus your attention in the first place. But no worries here, so would GPT, if it were not for the Attention mechanism. This AI technique, researched only in the last 10 years (papers from 2014 and 2017), is the real breakthrough behind GPT and is also the reason why language models were able to achieve the major step-up in the “intelligence” of communication.

The way the Attention mechanism works is that it calculates (still through linear algebra on matrices) a pair of relatively short vectors for each token. These vectors are labeled KEY and VALUE. They represent what is really important (and why) in the text. This way the engine does not force the neural network to put equal weight (= focus) on all million input numbers, but selects which subsections of the query vector are crucial for answering the question. When these are then combined into a transformed SUM of the elements, it provides the recipe for how to “cook” the answer to the question. What might sound like (yet another) complication is actually a key simplifier and energy saver. Past approaches to language models assumed a “memory” holding each word of the query text as equally important (or assigning the same, gradual loss of attention to previous words). That was prohibitively expensive and hence limited the development of better models. Therefore, jumping over the attention hurdle unlocked the training potential of AI models.
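For the curious, the weighting trick can be sketched in a few lines of plain Python. This is the textbook "scaled dot-product attention" from the 2017 paper, which, besides KEY and VALUE, also uses a third vector called QUERY (not discussed above). The tiny 2-element vectors are invented for illustration; real models use far longer ones and learn them during training.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # How well does the query match each token's KEY?
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Weighted SUM of the VALUE vectors, one output number per dimension.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Three toy tokens, each with a made-up KEY and VALUE vector.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 1.0]

weights, output = attend(query, keys, values)
print(weights)  # the third token gets the largest share of attention
print(output)
```

The point to notice is that the weights are not equal: the token whose KEY matches best gets the most focus, which is exactly the "where to look" idea.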

 

Finally, the AI part

It might be counter-intuitive for many, but the first 3 steps of GPT actually have nothing to do with Artificial Intelligence. It is only in step 4 that the real AI magic can be spotted. The essence of the 4th step is the Transformer core. It is a deep neural network with 96 layers of neurons and a bit more than 3000 neurons in each layer. The transformer part can also be called the Brain of GPT, because it is exactly the transformer layers that store the coefficients trained by running large amounts of text through the neural network. Each text used for training the AI potentially leaves a trace in the massive number of synapses between the GPT “neurons”, in the form of the weights assigned to the given connections.

As unimaginable as the net of hundreds of thousands (or millions) of neurons is to us humans, so is the actual result of the Transformer part of GPT: a probability distribution. No, not a sequence of words or tokens, not a programmed set of answer-generating rules, just a probability distribution.
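What does "just a probability distribution" look like in practice? A minimal sketch, assuming the usual final step of such networks (a softmax that converts raw scores into probabilities). The three candidate words and their scores are invented; a real model would score every entry of its dictionary.

```python
import math

def to_distribution(scores):
    """Softmax: turn arbitrary scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {token: math.exp(s - top) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}

# Hypothetical raw scores for the next word after "The capital of France is".
raw_scores = {"Paris": 4.0, "London": 2.0, "banana": -1.0}

distribution = to_distribution(raw_scores)
most_likely = max(distribution, key=distribution.get)
print(distribution)
print(most_likely)
```

Every candidate keeps some probability, even the silly ones; the network only says which continuation is more or less likely.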

 

Word by word, bit by bit …

Finally, in step 5 of ChatGPT we are ready to generate the textual form of the answer. GPT does that by taking the probability distribution (from step 4) and running the decoder part of the Transformer. This decoder takes the distribution and finds the most probable word to start the answer with. Then it takes the probability distribution again and tries to generate the second word of the answer, then the third, the fourth and so on, until the distribution of probabilities calls for the special End-of-request token. Interestingly enough, the generation does not prescribe how many words the answer will have, nor does it define some kind of satisfaction score (on how much of the query you have already answered with the sequence of words generated so far). Still, ChatGPT does not just hallucinate the answer or bet on a single horse. During the creation of the answer there are (secretly) at least 4 different versions in play (generated using the beam search algorithm). The application finally chooses the one that it deems most satisfactory for the probability distribution.
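Here is a small sketch of beam search on a toy example, to show why keeping several candidate answers alive can beat always grabbing the single most probable next word. The `TOY_MODEL` table (next-word probabilities that depend only on the previous word) is entirely made up, and the beam width of 4 mirrors the "at least 4 versions" mentioned above. It says nothing about how ChatGPT is actually configured.

```python
import math

END = "<end>"

# Hypothetical toy "model": next-word probabilities given the last word.
TOY_MODEL = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.4, END: 0.1},
    "a": {"cat": 0.3, "dog": 0.7},
    "cat": {"sleeps": 0.6, END: 0.4},
    "dog": {"barks": 0.9, END: 0.1},
    "sleeps": {END: 1.0},
    "barks": {END: 1.0},
}

def beam_search(model, beam_width=4, max_len=10):
    """Keep the beam_width most probable partial answers at every step."""
    beams = [(0.0, ["<start>"])]  # (log-probability, words so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, words in beams:
            for word, p in model[words[-1]].items():
                candidates.append((logp + math.log(p), words + [word]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for cand in candidates[:beam_width]:
            # Answers that reached the end token are set aside as complete.
            (finished if cand[1][-1] == END else beams).append(cand)
        if not beams:
            break
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished

results = beam_search(TOY_MODEL)
best_logp, best_words = results[0]
best_answer = " ".join(best_words[1:-1])
print(best_answer, round(math.exp(best_logp), 3))
```

In this toy table, picking the most probable word at each step would start with "the" and end at "the cat sleeps", while the beam also keeps the "a ..." branch alive and finds a sentence with a higher overall probability.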

 

Last (nail) polish

As humans, we might consider the job done by step 5 already, so what on Earth is the sixth step needed for? Well, anybody thinking so forgets that a talking human (or at least most of us) formulates a grammatically correct sequence. But AI needs a bit of help here. The answer generated by the Decoder still needs to undergo several checks. This step is also the place where filtering or suppressing of undesirable requests is applied. There are several layers on top of the raw text generated in the previous stage. This is also (presumably) the place where translation from language to language happens (e.g. you enter your question in English, but you ask GPT to answer in Spanish). With that, the final answer to the query is delivered and the user can read through it. And ask the next question 🙂

The flow of questions in the same conversation thread can actually lead to updating or tweaking the context parameters (Step 1) of the given conversation. The answering context thus gets more and more precise. Strikingly, OpenAI’s GPT models actually store each conversation, so if you need to refer back to some past line of a conversation, GPT will still hold the original questions and answers of that talk branch. Your questions (and answers) thus remain stored and in full recall any time in the future. Fascinating, given the number of users and queries that they file.

 

Steps Summarized 

The steps of GPT answer building described above have been neatly summarized in the following slide, which provides additional details and also indicates the transformations made in the individual steps to enable the total answer flow. So if you want to internalize the flow or simply revise the key architecture and principles, please read through the following summary:

 

Few side notes to realize …

Though the actual mission of this blog post is to walk the reader through the (details of the) process of generating the answer to a query prompt in GPT, there are a few notable side facts stemming from the way GPT is internally organized. So if you want to collect a few “fun fact” morsels that make you a more entertaining dinner buddy for your next get-away with friends (or for Sunday family lunch), here are a few more interesting facts to be aware of (in the GPT realm):

And a bit of a zoom-out view

Besides the fascination with HOW ChatGPT actually works, I often receive questions about its future or the speed of its past progress. I summarized the most common questions (I received) in the 1-pager showcased below. So if your curiosity is still running high, feel free to charge yourself with these FAQs: