FAQs

What is Predikto?

Predikto is a ragtag band of super-smart data scientists and computer engineers (plus some above-average managers and sales staff). Fueled by heavy amounts of coffee and sarcasm, distracted easily by Slack channels, and driven by the relentless pursuit of knowledge, we have developed the amazing Predikto Enterprise Suite that can spin data into actions.

What is The Predikto Enterprise Suite?

The Predikto Enterprise Suite is designed to ingest large-scale, distributed Industrial IoT sensor data along with extensive maintenance records covering a wide variety of industrial subsystems. Gathering large amounts of disparate data into one platform allows for formerly siloed inputs to drive powerful real-time predictive processes. This allows our customers to turn unplanned, disruptive events into routine, planned actions.

What is MAX?

Predikto MAX™ is the prediction engine at the heart of Predikto’s data analytics platform, and it has been the heart and soul of the company since its founding. It includes automated frameworks for feature scoring, model selection, and threshold optimization. These frameworks allow us to continuously improve model inputs and intelligently tune model outputs, delivering the best results possible.

What is Machine Learning, Artificial Intelligence, Predictive Analytics?

Predictive analytics (PdA) is an umbrella term for an area of statistics whose goal is to use historical data to derive predictions about future events. Machine learning (ML), an advanced form of PdA often described under the broader banner of artificial intelligence (AI), encompasses a very broad array of inferential multivariate statistical techniques. There are specialized ML techniques for predicting specific events (e.g., machine failure versus non-failure), for classifying events (e.g., high, medium, or low production yield), and for detecting patterns in data specific to some phenomenon (e.g., a machine deterioration pattern). Machine learning solutions are trained on historical input data (i.e., predictor variables) in order to make predictions about an output (i.e., whatever is being predicted). In every case, the data used to make a prediction is recorded before the event being predicted occurs.

For example, one might have failure-history data on motors and want to predict the probability of motor failure with a 1-hour prediction window. Feature data (aka independent variables or predictor variables) are recorded on the hour for each motor’s average RPM, vibration, temperature, and amperage. The ML model uses these data points to learn patterns and interactions among the features and to predict failures occurring in the period following the readings, in this case with a 1-hour advance window. More concretely, in a deployed ML solution, each motor’s feature data for hour 12345 become available, and the ML model uses them to predict whether each motor will fail during the following hour.
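
To make this concrete, here is a minimal sketch of that setup in Python with scikit-learn. The file and column names are hypothetical, and this illustrates the general technique rather than Predikto’s actual pipeline:

    # Minimal sketch of the motor-failure example (illustrative only).
    # Assumes a hypothetical CSV with one row per motor per hour,
    # including a 0/1 "failed" column.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("motor_hourly.csv")  # hypothetical file
    features = ["avg_rpm", "vibration", "temperature", "amperage"]

    # Label each row with whether the motor failed in the FOLLOWING hour,
    # so features at hour t predict failure during hour t+1.
    df = df.sort_values(["motor_id", "hour"])
    df["failed_next_hour"] = df.groupby("motor_id")["failed"].shift(-1)
    df = df.dropna(subset=["failed_next_hour"])

    X, y = df[features], df["failed_next_hour"].astype(int)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, shuffle=False  # hold out the latest hours
    )
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    print("held-out accuracy:", model.score(X_test, y_test))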

In the end, the quality of any PdA solution is contingent on the quality of the data and how features are engineered (i.e., measured, managed, transformed, etc.). All machine learning solutions should be rigorously validated against “new” or “unseen” data in a test, or simulated, environment before their performance can be assessed. Once validated, one can assess how quickly a return on investment (ROI) can be expected, if any. An ML-based PdA solution driven by quality data and tactful feature engineering (i.e., engineering guided by theory, logic, and empirical relationships) will tend to outperform human intuition, provided there is a reasonable degree of correlation between the feature variables and whatever is being predicted. It is not uncommon for a good PdA solution to produce an ROI very swiftly, and at a surprisingly low cost.
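
To make the validation step concrete, here is a small sketch (with made-up predictions and costs) of scoring a model on unseen data and turning the result into a rough ROI estimate:

    # Sketch: validate on unseen data, then translate hits and misses into
    # a rough ROI figure. All numbers here are hypothetical.
    from sklearn.metrics import precision_score, recall_score

    y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]  # actual failures in test window
    y_pred = [0, 0, 1, 0, 1, 0, 1, 0, 0, 0]  # the model's predictions

    precision = precision_score(y_true, y_pred)  # how often alerts were right
    recall = recall_score(y_true, y_pred)        # share of failures caught

    cost_unplanned, cost_planned, alert_cost = 50_000, 8_000, 500
    caught = sum(t and p for t, p in zip(y_true, y_pred))
    false_alarms = sum(p and not t for t, p in zip(y_true, y_pred))
    savings = caught * (cost_unplanned - cost_planned) - false_alarms * alert_cost
    print(f"precision={precision:.2f} recall={recall:.2f} savings=${savings:,}")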

What kind of data is required?

You might have “big data” or you might not. Regardless, you need good data that include information about whatever you’re trying to predict. Without good data, predictive analytics won’t help you predict what’s going to happen in the future. Your data might even come from disparate sources that have never been linked together, like maintenance records blended with sensor readings from a SCADA system, or from PLC boards. Some Predikto customers provide a few megabytes of data every week; others provide several gigabytes on an hourly basis. The Predikto Enterprise Suite automatically joins, merges, and transforms the data to identify the best-performing predictive analytics solution achievable, without requiring an internal data science team.

At a minimum, customers need data related to whatever it is they want to predict. If you’re interested in predicting when a machine breaks, your data needs to include a log of a reasonable number of occasions when the machine broke, along with any other related data about that machine for the same time period. For example, you might have data specific to maintenance records, usage details, operator information, and information about the asset itself (e.g., when it went into service, the make/model, etc.). If these data points can be linked to a particular asset at a particular time, you are on the right track.
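
As a sketch of what “linked to a particular asset at a particular time” means in practice (file and column names here are hypothetical):

    # Sketch: link maintenance records to sensor data by asset and hour.
    import pandas as pd

    sensors = pd.read_csv("scada_readings.csv", parse_dates=["timestamp"])
    work_orders = pd.read_csv("work_orders.csv", parse_dates=["opened_at"])

    # Bucket both sources to the hour so they share a common time key.
    sensors["hour"] = sensors["timestamp"].dt.floor("h")
    work_orders["hour"] = work_orders["opened_at"].dt.floor("h")

    # One row per sensor reading, flagged with any maintenance event that
    # was opened for the same asset in the same hour.
    linked = sensors.merge(
        work_orders[["asset_id", "hour", "failure_code"]],
        on=["asset_id", "hour"],
        how="left",
    )
    linked["had_failure"] = linked["failure_code"].notna()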

Some customers provide data only from equipment sensor readings and we detect performance patterns in advance, like predicting poor asset health, anomalies, or when equipment will fail. In such cases, surprisingly little data can be quite useful.

What is an ETL?

Extract, Transform, Load: the process of taking multiple data sources, formatting them correctly, and then ingesting and storing them in a unified data schema. Once the data are related to a common key, or asset, they can be used for predictive analytics.
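
Here is a toy illustration of the idea in Python (the sources and schema are hypothetical, not Predikto’s actual pipeline):

    # Toy ETL: extract two differently shaped sources, transform them onto
    # a shared schema keyed by asset_id, and load them into one database.
    import sqlite3
    import pandas as pd

    # Extract: raw exports with different shapes and column names.
    sensors = pd.read_csv("plc_export.csv")  # columns: asset, ts, temp_f
    maint = pd.read_csv("cmms_dump.csv")     # columns: unit_id, date, action

    # Transform: normalize names and units onto the common key (asset_id).
    sensors = sensors.rename(columns={"asset": "asset_id", "ts": "timestamp"})
    sensors["temp_c"] = (sensors.pop("temp_f") - 32) * 5 / 9
    maint = maint.rename(columns={"unit_id": "asset_id", "date": "timestamp"})

    # Load: store both tables in a unified store for downstream analytics.
    with sqlite3.connect("unified.db") as conn:
        sensors.to_sql("sensor_readings", conn, if_exists="replace", index=False)
        maint.to_sql("maintenance_events", conn, if_exists="replace", index=False)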

I have Big Data. Now what?

Big data is good, but it will never be great if it’s not used in a way that improves your business. To advance from merely having big data to actually using it to make good predictions, you must:
  • Identify what you want to do with the data and what you can gain from it
  • Know how to organize and combine the data for a specific purpose
  • Identify which approach works best; there are many to choose from
  • Identify a team to make this happen
  • Tweak the solution once you’ve begun to change your operational approach
  • Maintain the solution
Everything above is doable, but this is a slow and expensive process. The Predikto solution can do all of this for less than the cost of one data scientist.

How do you go from data to a predictive model?

There are many steps to creating a predictive analytics model. First, features must be scored and selected; features are the input variables most relevant to the event of interest. These features are then used to train models to predict the events of interest. Many different machine learning algorithms can serve as candidate models, and care must be taken to select the best-performing one. After a model has been generated, thresholds must be set to determine when to send a notification that an event is going to occur. Again, this takes research and calibration. Choosing the best machine learning algorithm for an application often requires significant domain expertise, extensive historical simulation, and large computational resources. As a result, features, models, and thresholds are typically decided at set-up time and remain static, even though the data they process changes continuously. Predikto has automated over 80% of this process, thus avoiding the “set it and forget it” trap.
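
In spirit (a generic sketch on synthetic data, not MAX’s internals), the model selection step resembles cross-validating a set of candidate algorithms and keeping the best scorer:

    # Generic sketch of model selection: score several candidate algorithms
    # with cross-validation and keep the best one. Synthetic data.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)

    candidates = {
        "logistic": LogisticRegression(max_iter=1000),
        "forest": RandomForestClassifier(random_state=0),
        "boosting": GradientBoostingClassifier(random_state=0),
    }
    scores = {
        name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
        for name, model in candidates.items()
    }
    best = max(scores, key=scores.get)
    print(f"best model: {best} (F1={scores[best]:.3f})")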

What is the Industrial Internet of Things (IIoT)?

The IIoT is a phrase used to describe the rapidly growing network of industrial assets (i.e., equipment) that are connected to a network. Through this connectivity, an industrial asset can automatically transmit various types of data in real time:
  • Sensor readings (e.g., vibration, temperature, movement, etc.)
  • Diagnostic trouble codes
  • Location
  • Status
These data points are stored in some fashion and might be considered “big data,” but they don’t have to be that big to be useful. The types of equipment rapidly becoming part of the IIoT include, but are not limited to:
  • Consumer vehicles
  • Locomotives
  • Mobile equipment
  • Production line components
  • Pumps, motors, etc.
Any device that automatically transmits data to the internet or to some network, and has an industrial application is part of the IIoT.
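
As a concrete (and entirely hypothetical) example, a single IIoT transmission often boils down to a small structured record like this:

    # Hypothetical example of a single IIoT data point as it might arrive.
    import json
    from datetime import datetime, timezone

    reading = {
        "asset_id": "locomotive-4021",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sensors": {"vibration_mm_s": 4.7, "temperature_c": 81.2},
        "trouble_codes": ["P0217"],  # a diagnostic trouble code
        "location": {"lat": 33.749, "lon": -84.388},
        "status": "in_service",
    }
    print(json.dumps(reading, indent=2))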

Why should industrial operations care about PdA?

You serve your customers by providing a valuable service, and you depend on your equipment to provide that service in a timely manner. When your equipment is not operational, you are wasting time and losing money. Predictive analytics turns unplanned, expensive downtime into managed, planned, preventative maintenance. Your assets stay healthy. You deliver to your customers on time and within budget.

What can Predikto do for me?

Predikto can turn your IIoT data into actions. We focus on your data so you can focus on your services. Predikto will work with you to identify unexpected pain points in your process that you wish to eliminate. From there, we will help you identify data sources (internal and external) that provide insight into your operations and assets. We will ETL this data, consolidating it around the assets to be monitored and storing it in a data lake. Our data scientists then shepherd your data through our mainly automated PdA process, from feature scoring, to model selection, and finally to predictions. These predictions are then turned into notifications, alerting you to future events and providing the information you need to take action before an unplanned event occurs.

What do I need to be successful with PdA and Predikto?

We come to the table with data scientists and engineers; we rely on you for your intimate knowledge of your customers. We work with you to identify pain points in your process that are causing you to waste valuable time and money. You need sensor data and maintenance data that link to those pain points (for example, work orders logged by asset ID, plus IoT data related to each asset). Most importantly, your pain points need to be time-driven: your events must occur at discrete times. For example, we can predict an engine failure or a machine overload because each occurs at a point in time. We cannot predict fraudulent bank statements because each is a single anomaly independent of time.

How does Predikto make PdA Painless?

We do the heavy lifting. All we require from you is data, located where we can access it, and a definition of what you want to predict. Once we have that, we set up the ETL, the models, and the outputs. We have API access for both data delivery to our system and results delivery from our system.
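
For example (the endpoint, token, and payload schema below are placeholders, not our real API), delivering a batch of readings might look like:

    # Hypothetical sketch of pushing data to an ingestion API endpoint.
    import requests

    payload = {
        "asset_id": "pump-17",
        "readings": [{"timestamp": "2017-06-01T12:00:00Z", "vibration": 3.1}],
    }
    resp = requests.post(
        "https://example.com/api/v1/ingest",  # placeholder URL
        json=payload,
        headers={"Authorization": "Bearer <token>"},
        timeout=10,
    )
    resp.raise_for_status()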

What if I have sensor data, but my equipment rarely fails?

Predikto thrives on identifying rare events in complex data. The event should have occurred at least 10 to 50 times, across your entire fleet, within your data set; if it hasn’t, we look for more common precursors in your data instead. We need enough event signatures in the data to train our models. For example, a nuclear meltdown is not possible to predict directly, but we can easily detect early warning lights.
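
A quick feasibility check is simply counting event signatures across the fleet; here is a sketch with hypothetical file and column names:

    # Sketch: count how often the target event appears in the history.
    import pandas as pd

    events = pd.read_csv("work_orders.csv")
    failures = events[events["failure_code"] == "E_OVERLOAD"]  # hypothetical code
    n = len(failures)
    print(f"{n} occurrences across {failures['asset_id'].nunique()} assets")
    print("enough to train on" if n >= 10 else "look for more common precursors")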

How does Predikto get my data automatically?

We provide your IT team with APIs that can receive your data over a secure cloud network, or we can install our solution behind your own private virtual cloud network. Once deployed, we receive your data automatically, transform the data for use in the predictive analytics solution, and provide you with results in whichever format is most useful to you (e.g., an email with a .CSV, an SMS message, a direct link to your data management system, a web dashboard, etc.).

How does Predikto operationalize results?

Predikto works with you and your SMEs to understand your current maintenance workflow. We provide an API endpoint so you can feed our notifications into your current system. We work with your fleet managers and maintenance managers to determine the best way to insert our information into your current flow. Our customer success team works closely with you after delivery to make sure our results are understandable and to help you build a modified workflow.

What does my company get from Predikto with a deployed solution?

The predictions can come to you however you like. Some customers want an automatically generated email containing a .CSV file with a list of assets predicted to require maintenance during a given time window. Others want an SMS message or a link to their in-house data infrastructure so they can automatically generate a work-order ticket.

Why is Predikto special?

We try more things, faster, to home in on the best solution. Anyone can ETL data. Anyone can implement machine learning. Predikto has automated feature creation, feature scoring, and model selection. By expanding the known universe of potential features and potential models, we select the best methods to meet your needs. Other companies find a local maximum. We find the global one.

This automation prevents your models from getting stale. Most models are “set it and forget it” – the huge capital outlay to develop them precludes easy evolution. Automated selection enables us to constantly try new features and models, so that our predictions evolve with your changing data and new inputs.

What is automated feature engineering and model selection?

We have developed methods to dynamically create thousands of potential features from a data set. These features are then scored against dozens of candidate machine learning models. By scoring potential features and potential models, we evaluate their effectiveness and incorporate the best selections into a model’s training procedure. By churning through a multitude of potential features and models in a short timeframe, we provide dynamic, accurate models that change with your data and your needs.
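
A miniature version of the idea (generic, on synthetic data, not MAX itself): mechanically generate many candidate features, then score each one against the target:

    # Generic sketch: generate rolling-window features from a raw signal,
    # then score every candidate against the target. Synthetic data.
    import numpy as np
    import pandas as pd
    from sklearn.feature_selection import mutual_info_classif

    rng = np.random.default_rng(0)
    df = pd.DataFrame({"vibration": rng.normal(3, 1, 5000)})
    df["failed"] = df["vibration"].rolling(24).mean() > 3.4  # toy target

    # Candidate features: every (window, statistic) combination.
    candidates = {}
    for window in (3, 6, 12, 24, 48):
        roll = df["vibration"].rolling(window)
        for stat in ("mean", "std", "max"):
            candidates[f"vib_{stat}_{window}h"] = getattr(roll, stat)()

    X = pd.DataFrame(candidates).fillna(0)
    scores = mutual_info_classif(X, df["failed"], random_state=0)
    ranked = sorted(zip(X.columns, scores), key=lambda s: -s[1])
    print("top features:", ranked[:3])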

What is automated threshold optimization?

MAX™ is configured to automatically determine when a prediction should trigger actionable notifications. On a per-model basis, sophisticated algorithms tune each model according to the historical performance of the notification scheme. The system is configured to meet the precision and coverage needs of each individual customer.
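
Conceptually (a generic sketch with synthetic numbers, not MAX’s actual algorithm), threshold tuning sweeps the cutoff on predicted probabilities until the notification scheme meets a precision target while keeping as much coverage as possible:

    # Generic sketch: choose the lowest probability cutoff whose historical
    # precision meets the target, which maximizes coverage (recall).
    import numpy as np
    from sklearn.metrics import precision_score, recall_score

    rng = np.random.default_rng(1)
    y_true = rng.random(2000) < 0.05                  # ~5% event rate
    probs = np.clip(y_true * 0.5 + rng.random(2000) * 0.6, 0, 1)

    target_precision = 0.80                           # hypothetical requirement
    chosen = None
    for cutoff in np.linspace(0.05, 0.95, 19):
        y_pred = probs >= cutoff
        if y_pred.any() and precision_score(y_true, y_pred) >= target_precision:
            chosen = cutoff                           # lowest acceptable cutoff
            break
    print("chosen cutoff:", chosen)
    if chosen is not None:
        print("coverage (recall):", recall_score(y_true, probs >= chosen))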

What is the difference between static and evolving models?

Data is always changing, and your models must change with it. Because of high labor costs, models are typically set once, during the training period. These models are said to be static, yet they run on real-time data. Over time, as the data changes (even subtly), the performance of static models degrades. Dynamic, evolving models change and tune themselves along with the changing data and conditions.
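
Here is a generic sketch of the difference, on synthetic drifting data (illustrative only): the static model is trained once, while the evolving one retrains on each new batch:

    # Sketch: a static model degrades as the data drifts; a model retrained
    # on recent data keeps up. Synthetic drift, illustrative only.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)

    def make_batch(shift):
        X = rng.normal(shift, 1, (500, 3))            # inputs drift over time
        y = (X.sum(axis=1) > shift * 3).astype(int)   # boundary moves too
        return X, y

    X0, y0 = make_batch(shift=0.0)
    static = LogisticRegression(max_iter=1000).fit(X0, y0)  # trained once, frozen

    for month, shift in enumerate([0.5, 1.0, 1.5], start=1):
        X, y = make_batch(shift)
        # The evolving model retrains on the first half of each month's data;
        # both models are scored on the unseen second half.
        evolving = LogisticRegression(max_iter=1000).fit(X[:250], y[:250])
        print(f"month {month}: static={static.score(X[250:], y[250:]):.2f} "
              f"evolving={evolving.score(X[250:], y[250:]):.2f}")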

What models do you use?

The short answer: all of them. Academics have spent years developing a cache of amazing machine learning models that are available to everyone. We have automated our model selection process so that we can easily test numerous algorithms and choose the right one. It matters less which model you use and more that you have chosen the right one for the right task. As part of our process, we constantly recheck the models in use and can automatically switch to a new one if the data and conditions require it.

How do you scale?

We scale by automating previously manual tasks. By developing automated frameworks for the majority of modeling tasks, our small band of data scientists can oversee a complex array of computational tasks. We have gone from a worker at each station of an assembly line to an army of robots overseen by a foreman.