Mining Your Business

Object Centric Process Mining with Wil van der Aalst, the Godfather of Process Mining

March 02, 2022 Mining Your Business Episode 30

Wil van der Aalst needs no introduction. The Godfather of Process Mining comes to the Mining Your Business podcast to share his story about how process mining became the thing we know today. As an exemplary researcher, Wil also talks about the shortcomings of current discovery methods and discusses object-centric process mining, an enhanced approach that can get even more out of process mining initiatives.

Learn more at the Processand website!

Follow us on our LinkedIn page here: LinkedIn
Learn more about what we do at Processand here: Processand

00:00

Patrick:  

Hey there, Jakub. 

 

00:02

Jakub:

Hey, Patrick. 

 

00:04

Patrick:

You know, I still remember the day when we started the podcast and we were joking: hey, wouldn't it be amazing if we got Wil van der Aalst on the show?

 

00:09

Jakub:

Yeah, right. 

 

00:10

Patrick:

Exactly. This is the Mining Your Business podcast, a show all about process mining, data science and advanced business analytics. Today we have Wil van der Aalst, the Godfather of process mining, and we talk about object-centric process mining, the future and much more. Let's get into it.

 

00:33

Jakub:

I was thinking really hard about how to introduce today's guest. After all, if it wasn't for him, there would have been no Mining Your Business podcast, and who knows what Patrick and I would have been doing in the first place. Anyway, I figured the best way to start would be to cite Wikipedia: the term process mining was first coined in a research proposal written by the Dutch computer scientist also known as the godfather of process mining, thus beginning a new field of research that emerged under the umbrella of techniques related to data science and process science at the Eindhoven University in 1999. The name of the paper was Process Design by Discovery: Harvesting Workflow Knowledge from Ad-hoc Executions, and it was written by the one and only Wil van der Aalst. Wil, I can't even express how excited we are to have you on our podcast. Thank you very much for coming and accepting our invitation.

 

01:24

Wil van der Aalst: 

Very happy to be here. Curious for your questions.

 

01:30

Jakub:  

Yeah, you know, our questions are always on top and on point. I guess the first one that's really hanging here in the air is: how does it feel to be known and named as the godfather of process mining?

 

01:45

Wil van der Aalst:  

It's a funny story. So many people have asked me where the name originated, etc. I know it has been used for over 10 years. It's completely unclear who started it, etc., etc. But more and more people have started to adopt it. And of course, it is a great honor to be called that. I hope people don't associate me with a process mining mafia, but rather see me as the person who has been trying to help the many people that want to work on the topic of process mining, and I feel a bit like a father of the people that were, let's say, working in the field very early. So I'm happy with the term, although it sounds a bit strange.

 

02:37

Jakub:  

Well, may I ask what the original paper that you came up with was really about? Like, how did you even come up with this idea of process mining in the first place?

 

02:47

Wil van der Aalst:  

Now, if you look at my career, many people know me now from the work on process mining, but actually in the 90s I wrote one of the first, probably the first, textbook on workflow management systems. So I was a big believer in workflow management technology. I also worked with people like Skip Ellis who were already working on these things in the 1970s. And I believed that workflow management was going to be used by any company in the world, that it would be as common as database systems, that any organization would be using it. For me, it was completely logical. So in the mid 90s, almost all of the things that I was doing were focused on this idea of automatically generating information systems based on process models, often expressed in terms of Petri nets; at the time BPMN etc. did not exist. And I really believed that that was going to be used in any organization. A few years later, I realized that I was wrong. I was, at that point in time, also working as a part-time consultant for a company called Bakkenist, which is now part of Deloitte, and inside Bakkenist there was a workflow group that was guiding larger organizations to select workflow management technology, and also to implement it. What was very surprising for me was that most of the companies that decided they wanted to buy workflow management technology, in the end, bought it but never used it. So they bought the software, but then they had incredible difficulties. The difficulty is that it's very easy to make a PowerPoint that people understand and that gives an idea of what the process is about. But it's incredibly difficult to create an information system that supports the process as a whole. One often uses the 80-20 rule: if you do analysis, 80% may be enough. But if you implement a system, a system that functions for 80% is not a system. Just imagine your car: say that 80% of the functionality is there and 20% is missing, you would not drive that car, you would consider it to be way too dangerous. And that happened. So I realized in the late 90s that most of the processes were actually much more difficult than what people were thinking. At the same time, I was a bit bored, let's say, writing papers that were very much model-driven, because I felt that they were too idealistic and did not capture the real problems. So that's why it was probably around '97, '98 that I started to look at the following interesting academic problem. It was not about business at all. It was the question: okay, if you have a Petri net which describes a process, you can simulate it; simulation tools we had already been building in the 80s. So given a process model, you can simulate it, you can generate behavior. But can we now do the reverse: just observe what is happening and automatically generate a model? Today, that sounds completely normal. But at that point in time, it was a very strange idea. And I saw immediately that from a theoretical point of view this is a great topic, it has so many challenges in it, it was original. So I jumped onto the topic. And you referred earlier to, let's say, this research proposal; it was research that I started in the late 90s together with Ton Weijters. That was the first project. But basically, after a couple of years, anything that I did was related to process mining. And the trigger for me was: a) workflow technology doesn't work if it doesn't really capture the real process, and b) from a scientific point of view, it is an incredibly interesting problem: okay, you look at example behavior, and you automatically try to generate a process model. But people these days look at the commercial process mining tools; that is what they are fascinated by.

 

07:33

Patrick:  

So, honestly, how do you feel, how proud are you, looking back from the late 90s, from that original concept that you had, to now, the year being 2022, and having such a big splash in the enterprise with Celonis? I mean, I'm not sure if you've seen it, but Celonis had an ad during the Olympics. I mean, how does that feel?

 

07:54

Wil van der Aalst:  

So of course, I feel that I'm very lucky that I, let's say, focused on that problem. And I'm very grateful that there's now so much adoption. But the thing that I should also mention is that it was very difficult for a very long time, right? When I started to work on this, people thought that I was nuts. I had been successful before, I was successful in workflow management technology. People were fascinated by process modeling, and it was very hip in the late 90s to model processes. And people did not understand why I was doing this. What I should also mention, and I think many people do not want to remember that, is that as of, let's say, the year 2000, I have been in contact with companies like IDS Scheer, which is now part of Software AG; I gave many presentations inside IBM, I gave several presentations inside SAP, I was at HP, I was at Google, I was at all of these companies, traveling around the world, saying to these organizations: look, this is something that's going to be super important, it's fascinating. If you think about SAP, if you think about Software AG, at the time IDS Scheer, IBM, the thing that they want to do is support the efficient and correct execution of processes. So why didn't they do it? So on the one hand, I'm proud in the sense that, despite all the people not buying it, I just continued working on it. And I'm very happy that today there are companies like Celonis that are super successful. Also, please note that if you look at a company like Celonis, this also took a few years, right? What many people do not remember anymore is that in the mid 2000s several of my students started process mining companies. There was a company called Futura Process Intelligence. Nobody knows that company today. But that company was run by Peter van den Brand, who was one of the people working for me. After drinking a couple of beers, I told him: Peter, if you want to start a company, then start a company, and then he started the company. In 2009, Futura Process Intelligence won the Gartner Cool Vendor award. I was giving talks at large Gartner BPM conferences. And I thought, okay, now it is clear to everybody, this is what we need to do. It seemed completely clear. All the people in the audience were fascinated by it, this company won the Cool Vendor award, and then all the people returned home and did nothing. The software was in the end taken over by Lexmark and is still being used in several places. But clearly it was, in a way, too early, although it had many of the capabilities that you see today. If you look at a company like Celonis: also when Celonis started, it was not so easy. I very much remember giving a talk for Celonis in the Allianz Arena. There are these business units there, and they had a conference, which was big for them, but we were like 10 people there. That was not logical at all. But over time, and let's say in the last five years, it has become very visible that this is a big technology. And now suddenly everybody wants to have it and is willing to pay a lot of money for it. And they could have had it for free 15 years ago.

 

12:11

Jakub:  

Yeah, this is an incredible journey. And before we actually jump to the topic that we want to discuss a bit more in depth, I have one last, let's say, personal question. What was it that kept you going over all of this time, seeing so many difficulties, bottlenecks and pushbacks from organizations, with adoption taking that long? What got you excited, and what gets you excited even today, after so long in the field?

 

12:42

Wil van der Aalst:  

So I think that it is a great scientific problem, right? If you do model-based research, you write down a model. Perhaps the best way to explain it: if I read papers of mine from, let's say, 25 years ago that talk about verification, and I had to write a paper today, I would write the paper in exactly the same way. I wouldn't change anything. If I look at my process mining papers of 20 years ago, I would write them completely differently. And this shows that this is a super interesting area, in which many things are underexplored. That kept me going; along the way, there were so many interesting problems that we saw. I was also always convinced that in the long run industry would adopt it. I often use the term, let's say, what I call process hygiene. Many of the processes that organizations have, if you look closely at them, have lots of problems. Doing process mining is as logical as washing your hands when you go to the toilet: you have to do it, you should not look for a business case. If you are responsible for a process, you should be eager to be proud of it, and you should make sure that it runs well. Because of that, I always felt that in the long run this was going to happen. Also, it is a relatively, let's say, cheap technology compared to many other things.

 

14:25

Jakub:  

Right. Yeah, again, an interesting journey, Wil, and I think both myself and Patrick are thankful that you took it that far, because, well, then we can actually do this podcast and talk about it with you. But where we are right now with process mining: a lot of companies are adopting it, but obviously there are a lot of other hiccups, issues and problems that we are trying to solve. Some of the problems are not really solvable with the current implementations and with the way the algorithms work right now. And one of the, let's say, focus areas that we are now looking at is something that's called object-centric process mining. And Wil, I would probably ask you to explain what object-centric process mining is, before we jump into more questions and conclusions about it.

 

15:20

Wil van der Aalst:  

So for many people, if they see an introduction to the topic of process mining, it is clear: you need to have event data. And then you try to describe what event data is. You can give examples of patients being treated in hospitals, suitcases going through airports, the handling of orders in SAP systems, etc. But the concept that you need for basic process mining is that every event should refer to a case, have an activity label and have a timestamp. That's the minimal information. There are at this point in time over 40 commercial process mining tools, plus several open-source ones, and they all expect this; if you don't have events where, for every event, you have a case ID, a timestamp and an activity name, you cannot do anything. So that is the conceptual model of basic process mining. And it is fascinating that with these three fields, case ID, timestamp and activity, you can do so many things: you can automatically generate process models that describe what's going on, you can automatically generate simulation models, and if you have a BPMN model or any other description of the process, you can do conformance checking and see where the biggest deviations are, where the biggest bottlenecks are. So it's great that with this simple, very basic model, just case ID, activity name and timestamp, you can do all of these things. People very quickly buy into this, they can understand it, and I do not believe many people would disagree with it. I do not believe in just looking at dashboards. When people do process mining and they look at the results, they have to understand the connection between the data and the results. It's the same as if you look at a spreadsheet and you just look at the numbers and have no idea how they are computed: it is very dangerous, you can go bankrupt if you do that. In the same way, I feel that everybody using process mining should understand this. So this basic model has been very successful up to this point in time. But as applications become more ambitious, it is clear that this mental model that I just described is also very limited. And object-centric process mining takes away one of the constraints that is built in, and that is the constraint that an event should refer to a single case ID. So that constraint is being dropped. The best way to explain it would be to think of examples. So if you place an order via some web shop, let's say Amazon, right? You select multiple items, and at some point in time you push the button, I want to order this, and then you enter payment information, etc., etc. Now just look at the moment where you push the button, place order. That is clearly one event. The user doesn't click 10 times, no, this is one click: I want to buy these items. And now the question is: what is the case ID? If you have this classical view, you have to make a choice. So you can say the case is the order, and that would be the most logical thing. But the order consists of multiple items. So you could also say that it is the item, but there is not just one item; I could have ordered five products, and then there are five objects, as we call them, within that order. It could also be that we are interested not so much in individual orders, but in the customer journey: the customer that placed this order is later phoning a call center complaining about stuff, and we want to relate that.
So when the person was pushing this button, it referred to one order and a bunch of items, but it also referred to the customer, and I could go on and on and on. If it is a physical process, it could be the location, right, it's taking place on this machine or something like that, or this machine is being used, and typically not one machine is being used, multiple machines may be used. If you are in a hospital and there are people standing around your bed, what is the case? Is it the doctor? Is it the patient? Is it the nurse? That shows that the events we witness in reality do not refer to a single object, but to a bunch of objects. And that is the core idea of object-centric process mining: you no longer say an event refers to a single case, because that's a very one-dimensional view; you say an event can refer to any number of objects. And if you take that mental model, then you can suddenly do much, much more than you could do before.
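
To make the contrast concrete, here is a minimal sketch in Python; the web-shop example and all field names are illustrative assumptions, not the schema of any particular tool or standard. It shows a classical event record that is forced to carry one case ID next to an object-centric event that references an order, several items and a customer at once, and what happens when the latter is flattened onto the item as the case notion.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Classical process mining: every event carries exactly one case ID.
@dataclass
class ClassicalEvent:
    case_id: str        # the single object the event is forced to refer to
    activity: str
    timestamp: datetime

# Object-centric process mining: an event may reference any number of objects,
# grouped by object type (order, item, customer, ...).
@dataclass
class ObjectCentricEvent:
    activity: str
    timestamp: datetime
    objects: dict = field(default_factory=dict)  # object type -> list of object IDs

# One click on "place order" in the web shop is a single event that relates
# to one order, five items and one customer at the same time.
place_order = ObjectCentricEvent(
    activity="place order",
    timestamp=datetime(2022, 3, 2, 10, 15),
    objects={
        "order": ["o1"],
        "item": ["i1", "i2", "i3", "i4", "i5"],
        "customer": ["c1"],
    },
)

# In the classical view we must pick one case notion, e.g. the item,
# and replicate the same click five times.
flattened = [
    ClassicalEvent(case_id=item, activity="place order", timestamp=place_order.timestamp)
    for item in place_order.objects["item"]
]
print(len(flattened))  # 5 copies of what was really one event
```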

 

20:50

Patrick:  

Now, why is it such a problem to have this one-dimensional view of a case? Like, why is it wrong to say, I'm just going to look at my order items, right?

 

21:02

Wil van der Aalst:  

So in principle, that is okay. But you could say it is like going from 3D to 2D: you're basically removing a dimension if you focus on a single object, and there are certain things that you cannot see. And you're also often presenting diagnostics that are very misleading. So if I go back to my example: okay, I'm on the Amazon website, and I have this event, I push the button, I want to order these five items. If I take the order perspective, it is quite natural to think of that as one event, right? If I take the item perspective and I have the classical view, so we would say, okay, there is push button, and push button is related to items, then I need to replicate that event. Because if I have the requirement that every event should refer to one object, I could say, okay, I forget about the four other items, which is of course wrong. So the only way to incorporate that is to basically make five copies of the same event, all of them called push button, and they all refer to one of the items that I ordered. Now you could argue, okay, that is just fine, and in a way you could argue that is just fine. But what we no longer see is the interaction between the item and the order. There are many more objects, and we don't see the relationships. That's one big problem. The other problem in this example is when we compute KPIs and we look, for example, at waiting time, or we look at costs, we look at deviations, right? There was this one event, place order, we pushed the button. But we have replicated that five times, so it could be that our financial data is also replicated five times, the number of deviations is also replicated, etc., etc. So it becomes highly confusing. In my papers I've written about this; there are actually two phenomena, called divergence and convergence. So there is the problem that you are replicating things, which becomes misleading. But it could also be, in the example of pushing a button, that I say, okay, I should not choose the item as a case identifier, I need to stick with the order. But if I do that, there will be later events that refer to just a single item. The process may be completely structured, but because I put all the events of the items in an order into one bucket, I automatically get a spaghetti model, because I'm not able to follow a single item. I realize this sounds very technical, and without drawing it, it's very complicated. But the basic problem is that in reality the things that happen refer to multiple objects. And if you are forced to pick a single object, one way or the other you get into trouble: you either lose causalities between events, or you are replicating events, leading to very misleading diagnostics.
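
A small numerical sketch of the convergence problem described above, with made-up numbers and field names: replicating the single "place order" event once per item makes naive per-event KPIs, such as an event count or a sum of order values, five times too large.

```python
# Hypothetical object-centric view: one "place order" event with one order value.
event = {
    "activity": "place order",
    "order": "o1",
    "items": ["i1", "i2", "i3", "i4", "i5"],
    "order_value": 100.0,
}

# Flattening on the item as case ID replicates the event once per item.
item_level_log = [
    {"case_id": item, "activity": event["activity"], "order_value": event["order_value"]}
    for item in event["items"]
]

# Convergence: the same click is now counted five times ...
print(len(item_level_log))                            # 5 instead of 1
# ... and a naive sum of "order value per event" is inflated five-fold.
print(sum(e["order_value"] for e in item_level_log))  # 500.0 instead of 100.0
```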

 

24:50

Jakub:  

I guess one of the examples that came to my mind when I was reading your paper is a problem in the purchase-to-pay and accounts payable processes: on one side you have your orders, your purchase orders, and on the other side you have your accounting items. So what process mining implementations usually do is look at the process from two different perspectives, from the perspective of the purchase order and from the perspective of the invoices, the accounting. Is this, let's say, the right approach to go about solving this problem, at least with the technology that we have at hand now?

 

25:30

Wil van der Aalst:  

It depends. Some of the systems have started to work on this; in the academic world it is a hot topic and several people have started to work on it, but in commercial systems it still has to arrive. Given the current technology, if you have a system that forces you to take this single-case perspective, that is the only way you can go about it, there is no other way. The thing that you should realize is that by doing that, you're basically looking at a three-dimensional thing from two different angles. You have made the world flat, in a way, and there could be relationships between these different views that you no longer see. I think that's one problem. The other big problem, and I think that is even worse, is that in most projects most of the effort is put into data extraction. In the solution that you propose, you basically need to extract the data twice, or at some point in time there is a split where you go one way or the other way. This is labor intensive, but it also quickly leads to errors, right? Because in one view this row in SAP has this interpretation, and then in the other view exactly the same row has a completely different interpretation. And that's very dangerous.

 

27:00

Patrick:  

Absolutely. So, I mean, how would you go about explaining this concept of a two-dimensional view of a three-dimensional object to a business user, who is honestly a lot of the time already overwhelmed with what they're seeing in the traditional single-case notion?

 

27:20

Wil van der Aalst:  

Yeah, so if reality is complicated, you have to accept it. That's perhaps a very academic viewpoint. I think one thing that we need to do for sure is educate people better, right? It cannot be the case that somebody has been writing, like, 10 pages of SQL scripts to extract a particular event log, and then you look at the dashboard, and you just see the dashboard and do not understand the connection between that and what was happening in your system; I think that is very, very dangerous. So to a business user, I would say: you have to understand this. And the concept is also not so difficult. This single view of case ID, activity name, timestamp was also difficult 20 years ago, right? That was an abstract view of the world, and now it is clear that that abstract view has reached its limits and needs to be extended. What is also very important: people have been drawing process models since the 70s, right? And that has been a hobby for many; I've been involved in crazy projects where 50 people would be doing process modeling. And for me it was completely unclear why they were doing that project and, more importantly, why somebody would pay for that, right? But if you look at the models that people typically create, they handcraft these things by hand, and they often have exactly this problem. So if you look at the SAP reference model, I wrote an article about that with Jan Mendling a very long time ago, showing that certain things related to this were not correct in the SAP reference model. For example, if you ask a person: okay, make a process model to describe the hiring process for new employees. They start drawing boxes and arrows, etc., etc. But you can only describe that process correctly if you realize that there are activities related to applicants and there are activities related to the position. Sometimes these things meet, but, for example, if you create a new position in a company, you do not know who is going to apply, right? There are no applicants yet. So I have an activity, create vacancy or something like that, that is independent of the applicants. At the same time, people are applying, and perhaps people are applying for multiple jobs, etc., etc. And if you look at the models that people made in the past describing these types of processes, they are simply incorrect. And nobody realized it, because everybody just sees a diagram with boxes and arrows, and as long as they recognize some of the names of the boxes, they are okay with it. But that's, of course, very, very bad. If you want to describe these processes, it has to work. And now I link it back to what I said in the beginning: exactly these things are the reason why workflow management technology failed, because people were trying to abstract this away. And you cannot abstract it away, because it's reality. We are now having this podcast, so let's consider the podcast to be one event. There are multiple objects involved, right? There's the podcast series, that would be one object. And then there are the three of us, right? I just had lunch, right? I didn't have lunch with you, I had it with my family. If you look at reality, you will see that this is everywhere. Yeah, sorry, business users, you really need to understand this to improve your processes.

 

31:40

Patrick:

Yeah, no way out.

 

31:41

Jakub:

I must say, I love how you insert the ideas of process mining into our daily lives. We have this as an activity in our company: when we are bored, we just think of these random process mining use cases for our daily activities. We've had some good candidates that I'm probably not going to mention today. However, going back to object-centric process mining, I have maybe one more question. You touched upon this topic a little: apart from, let's say, minimizing the errors and seeing the process as a whole, what kind of problems would it help us solve?

 

32:14

Wil van der Aalst:

Yes, I think there are two sides to it, right? One side is that we need to make the data extraction simpler and more direct. We need to have event logs that reflect reality. If we do something different, it is more complicated in the end, right? So I strongly believe that by embracing object-centric process mining, the data extraction problem becomes easier, because you try to take what you see in systems like SAP and convert it into what really happened in reality. And we forget about funny table names and all the complications that are there. We try to convert the data into the things that have actually happened, and this could be order-to-cash and purchase-to-pay, but also production, etc., etc. The data that you see in information systems often has no semantics. It is just data, and you need to convert it into events that correspond to the business events that have really happened. So I think that, in the end, it will save a lot of effort and a lot of money. That's one view. The other view: I think there are many questions that we cannot answer today because the models that we use are too limited. We are looking at 2D and we know that it is 3D, and there are certain things that we cannot see. That sounds very abstract, but I strongly believe that we are reaching the limits of what we can do with this two-dimensional data. So we will be able to answer more advanced questions. One of my favorite examples: in research there are many people working on predictive analytics, so making predictions on top of process models and event data. Often they are taking this two-dimensional view and they are missing a lot of crucial information, and the commercial vendors are simply adopting this because they just include these simple capabilities and then they can write on the product that it supports artificial intelligence and machine learning, etc. But if you think about it, of course this does not work. If I, I don't know, drive my car from here, from Aachen, to Munich, or from Aachen to Eindhoven, the brand of my car, the color of my car, my age, those are not the most important features. The most important features are the other people on the road. Whenever you want to talk about processes, you need to be very holistic. Another example I can give is a project we had where, in the end, there were huge delays in the handling of cases. And in the end, the bottleneck was people that were spending just half an hour per week on the process. So we could say, okay, we just focus on that half an hour and we ignore everything else. But it shows that this is not working, because if we just look at this, we think that cannot be a bottleneck, but it is a bottleneck because it is interacting with other processes. So this move from a single case ID to multiple case IDs will also play an important role. If we want to build a digital twin of an organization, we can only create a realistic simulation model that behaves like reality if we make it 3D, right? If we make it 2D, it is not going to show the problems. It's the same if we look at supply chains: to understand delays in the supply chain, it is not enough to just look at one node in the network, we need to look at all of them. So I think lifting this to the level of multiple objects, multiple processes, multiple organizations is key.

 

36:51

Patrick:

So what do you think will need to happen in order for us to actually be able to use this object-centric process mining? A lot of us are used to the standard process graph. So what, from a data visualization perspective, will need to change to be able to view the process from multiple case angles?

 

37:10

Wil van der Aalst:

So the best way to visualize it, and I'll put it very operationally: the thing that one needs is tables that describe events, right? That doesn't change. But these tables should be connected to any number of objects. So we need to have an event table, we need to have, let's say, object tables which describe how objects change from one state to another state, and we need to have the connection between them. If you look at many of today's systems, they typically have something like an activity table and a case table, where you have the requirement that every event in the activity table can refer to only one case. And that is what you should relax. Also, if you look at the typical case table, there are things that do not change, that are static; I don't know, a customer has a birth date or something like that. But it is also important that we can include state changes of the object itself. So a person has a birth date that does not change, but, I don't know, a patient in a hospital has this blood pressure one day and that blood pressure the next day. So I think you can visualize this in terms of tables, where you have event tables that refer to, let's say, these business events that can be related to multiple objects, and you have object tables that show properties of objects that are either stable or changing. That's kind of the mental image that one should have, and it also shows that it's not so difficult to imagine that this is possible.
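
One way to picture these tables is the following sketch using pandas; the column names and the small example are an illustration of the idea, not the OCEL schema or any vendor's data model. There is an event table without a mandatory case column, an object table whose rows can capture both static attributes and state changes, and a relation table that connects each event to any number of objects.

```python
import pandas as pd

# Event table: one row per business event, no single case ID is forced on it.
events = pd.DataFrame([
    {"event_id": "e1", "activity": "place order", "timestamp": "2022-03-02 10:15"},
    {"event_id": "e2", "activity": "pick item",   "timestamp": "2022-03-03 09:00"},
])

# Object table: attributes may be static (a customer's birth date)
# or change over time (an item's status, a patient's blood pressure).
objects = pd.DataFrame([
    {"object_id": "o1", "type": "order",    "attribute": "total",      "value": "100.00",     "valid_from": "2022-03-02"},
    {"object_id": "c1", "type": "customer", "attribute": "birth_date", "value": "1980-05-01", "valid_from": "2022-03-02"},
    {"object_id": "i1", "type": "item",     "attribute": "status",     "value": "ordered",    "valid_from": "2022-03-02"},
    {"object_id": "i1", "type": "item",     "attribute": "status",     "value": "picked",     "valid_from": "2022-03-03"},
])

# Relation table: each event can point to any number of objects.
event_objects = pd.DataFrame([
    {"event_id": "e1", "object_id": "o1"},
    {"event_id": "e1", "object_id": "i1"},
    {"event_id": "e1", "object_id": "c1"},
    {"event_id": "e2", "object_id": "i1"},
])

# Which objects does the "place order" event touch?
related = event_objects[event_objects["event_id"] == "e1"]["object_id"].tolist()
print(related)  # ['o1', 'i1', 'c1']

# Which events touched item i1? Following one object stays easy as well.
touched_i1 = event_objects[event_objects["object_id"] == "i1"].merge(events, on="event_id")
print(touched_i1[["event_id", "activity"]])
```

The OCEL standard mentioned later in the conversation organizes event data along similar lines, although its concrete format differs.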

 

39:20

Jakub:

You mentioned that this is already a big topic in academia and that there are different research teams looking into options for how to work with OCPM. How far along are the vendors and, generally, the commercial usage of this approach?

 

39:36

Wil van der Aalst:

So on the academic side, there has been a lot of work. I think most of the solutions that were provided are probably too complicated. What we are trying to do with the OCEL standard, object-centric event logs, is to do something that is kind of in between the classical simple view and these very complicated models. And I think that's the way to move forward. If you look at commercial systems, some of them are starting to embrace this idea. For example, in Celonis there is the multi-event log, and in the multi-event log it's slightly different: there you have, let's say, lifecycles of individual objects that you can connect to each other. So that's a slightly different view, but it aims in the same direction. What's also interesting to mention is that in Eindhoven we had a larger meeting on standardization processes in the field of process mining. There is, as you know, the XES standard, an official IEEE standard for storing event data. I think that standard has been fairly successful, but it is now also reaching its limits. So we are currently working on another initiative to create the successor of XES, which will capture this idea that I explained in a very clean way, in such a way that it is easy for both vendors and users to adopt.

 

41:19

Jakub:

So outside of object-centric process mining, what are some of the other things that we as consumers of process mining tools generally have to look forward to in the next one to five years?

 

41:35

Wil van der Aalst:

So I think a new development will be the connection between workflow automation and process mining. This will create a layer on top of existing systems, where we should not repeat the mistake that we made before of thinking that we can replace these complicated systems. If you install SAP completely, you may have 800,000 different tables. So it's very naive to think, okay, let's take a new technology and we don't need these 800,000 tables anymore. At the same time, these other systems, take a look at Oracle, take a look at SAP, etc., etc., these systems are not aware of inefficiencies. So we should not throw them away, but we should build a layer on top of them to continuously look for these inefficiencies and for how we can improve them.

 

42:33

Jakub:

Well, this is a fascinating direction in which we are heading as process mining, as a discipline. I would love to end our discussion on a lighter note. I noticed recently that you wrote an article on the Celonis blog, where you are basically calling on people to learn process mining and explaining why it is the top skill to learn in 2022. And since I know a lot of our listeners are also people who are still at university and deciding about their future careers, I would love it if you could just use this opportunity and tell them why this is a top skill to learn and why they should actually care about process mining.

 

43:11

Wil van der Aalst:

Yes, I think everybody can see that data is becoming more and more important, simply because it is available. So many young people are very excited to learn things related to data science and machine learning. But at the same time, if you are looking for a job, you should try to think: okay, what are the skills that are really needed in the long term? I think much of the stuff that is happening in the field of machine learning will be completely automated, right? It will be like a black box: you put data in and something comes out, and then you don't need a lot of expertise to do that. What will really be challenging, where you really need to combine technical skills with domain knowledge, is process improvement, right? You cannot say: okay, here I have a car factory, I'll train a neural network and then it's all okay; it's not going to work like that. So I think there are many jobs in the future that will become much more analytical and data-driven, and I think the interesting problems are always dynamic, right? If you think about health care, these processes are very dynamic. But also think of an auditor: an auditor in the past could get away with taking a sample, talking to some people, and it would be okay. The future of auditing will be that you analyze everything that has happened and check whether it's okay or not. So many jobs will require this combination of being data literate, so that you can deal with data, and at the same time being process-centric. The problems that do not require processes, that are, let's say, basically static decisions, I think many of them will be automated, but the process part will not be.

 

45:11

Jakub:

So, in translation, everybody who listens: let's learn data science and process mining, and let's build a better process world.

 

45:19

Wil van der Aalst:

That's the idea.

 

45:21

Jakub:

Let's hope that we can help on this front. Wil, last question: where can people follow your work and, you know, read up on everything that you are helping create, and where can they find out more about what you are doing right now?

 

45:36

Wil van der Aalst:

So if you're new to the topic, I think a very easy way to get started is the website processmining.org. If you go to processmining.org, that's a very nice, easy website that links to many other websites and to material; that's what you can do to get into the topic if you are getting started. For people that would like to dive deeper into the research, just go to my website. Just type Wil van der Aalst and you will find it, it's fairly easy to find. I try to collect most of my papers on the website so that people can use them. If you want to become active, please take a look at the website of the IEEE Task Force on Process Mining. We are organizing all kinds of events to spread the knowledge of process mining, and there is the great ICPM conference. Hey, it was in Eindhoven last year, you mentioned that. This year it will be in the most beautiful place in the world, called Bolzano; it's in the Italian Alps. It's really beautiful. So that's a nice place where you can meet people, and you can use the infrastructure of the IEEE Task Force on Process Mining. And then, of course, there are the courses you hinted at earlier, like the new Celonis course. That's a very compact course that takes, all together, like 10 hours or so, a relatively compact course. I would not recommend doing it in one day, but if you would do it in one day, then in one day you know the basics of process mining. There are also, let's say, more extensive courses based on my process mining book. So there is the Coursera course, which is much more elaborate. What I find very interesting, if you take a look at my book and that course, is that the theory did not change. The tools changed, but the concepts and ideas and the problems did not change. And I think that's a good sign, because it shows that this is not a hype but, let's say, a stable problem that you get better at each year.

 

48:04

Jakub:

Lovely. Wil, I would like to thank you again very much for accepting our invitation and for being here with us, talking about what process mining is, what kind of issues we are solving and where we are going in the future. So once again, thank you very much.

 

48:20

Wil van der Aalst:

I was very happy to be here, thank you.

 

48:23

Jakub:

For you, dear listeners, we are happy to have you as well. Please write to us, leave us a review, leave us a comment. If you have any questions, you can write to us directly on LinkedIn, where we are very active, or you can just send an email to miningyourbusinesspodcast@gmail.com. We are here to help you, so if you have any questions regarding process mining, please just let us know. Thank you very much, thank you for listening, and see you in two weeks with the next episode of the Mining Your Business podcast. Bye bye.