Machine Learning Development Standards at The Data Analysis Bureau

As data science and machine learning specialists, we are known for building data solutions that include machine learning (ML). We help clients and partners deliver intelligent solutions through an end-to-end service, from ideation and strategy through to models running in production.

In our engagements, we are often asked how to start a project, how to scale a project, or what is needed in the next stage of development. In this series, we’ll be sharing our project insights, guidance and recommendations.

 

Machine Learning Development Framework

For our machine learning projects, we have developed a framework that breaks the development process down into four stages, each representing a level of technological maturity. We call these stages development standards (see Figure 1). They draw on the technology readiness levels developed by NASA and used extensively in European and US innovation projects. Figure 2 shows how our development standards map to the technology readiness levels.

In this article, I describe our four core development standards. In a later article, I will describe more precisely the technical actions and analyses we use for each of them.

T-DAB Development Standards in Detail

Our Machine Learning Development Standards

Proof of Concept (PoC)

Using a representative sample of data, we extract general insights and quickly fit machine learning models to check whether the data contains enough structure to make a full machine learning project viable, and which types of analytical solution the data is best suited to. We also quickly assess the quality of the data and recommend changes to how it is collected.

For the client, this means they get an idea of what sort of insight they can draw from their data, whether it is suitable for further ML applications, and how to modify their data collection process. This often yields actionable insights that demonstrate the value of embracing machine learning.
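As a rough illustration of the quick-fit idea described above, the sketch below compares a simple one-feature linear fit against a predict-the-mean baseline on held-out data. It is a minimal sketch under stated assumptions, not our actual PoC tooling: the function names, the choice of a linear model, the mean absolute error metric, and the `min_gain` threshold are all hypothetical and chosen only for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def mean_abs_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def quick_structure_check(xs, ys, test_fraction=0.3, min_gain=0.1, seed=0):
    """Compare a quick linear fit with a mean baseline on a held-out split.

    min_gain is a hypothetical threshold: the fraction by which the model
    must reduce the baseline error before we say the data shows structure.
    """
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]

    train_x = [xs[i] for i in train]
    train_y = [ys[i] for i in train]
    test_x = [xs[i] for i in test]
    test_y = [ys[i] for i in test]

    # Baseline: always predict the mean of the training targets.
    baseline = sum(train_y) / len(train_y)
    baseline_mae = mean_abs_error(test_y, [baseline] * len(test_y))

    # Quick fit: one-feature linear regression.
    slope, intercept = fit_line(train_x, train_y)
    model_mae = mean_abs_error(test_y, [slope * x + intercept for x in test_x])

    gain = 1 - model_mae / baseline_mae if baseline_mae else 0.0
    return {
        "baseline_mae": baseline_mae,
        "model_mae": model_mae,
        "gain": gain,
        "has_structure": gain >= min_gain,
    }


# Synthetic example data (invented for this sketch, not client data).
rng = random.Random(42)
sample_x = [rng.uniform(0, 10) for _ in range(200)]
sample_y = [2 * x + rng.gauss(0, 1) for x in sample_x]
print(quick_structure_check(sample_x, sample_y))
```

If the quick fit barely improves on the baseline, that suggests either that the data holds little usable signal for this target or that a different type of model or feature is needed, which is exactly the kind of question a PoC is meant to surface early.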

You can read an example of a demonstrator in a predictive analytics case study we did for a leading packaging manufacturer here.

 
Demonstrator

During the demonstrator phase, we extend the work done under a PoC and spend more time developing the machine learning models. We can also use more data, accessing it directly from a database. The aim is to demonstrate the functionality of the machine learning models and to provide examples of how the outputs can be used (for example, on a dashboard).

For the client, this means they get concrete examples of the value that ML can add to their data.

 
Prototype

The prototype extends the ML demonstrator with considerably more data engineering effort, testing all elements of a functioning architecture to produce a fully functioning prototype. On the ML side, we do more feature engineering and model tuning to reach a better-performing model. Typically, the prototype can then run in a pre-production environment.

For the client, this means they can see how the model would work, the level of performance they would get, and the benefits it would add to their business.

An example of a prototype is the digital twin of a sailing boat we built to improve autopilot technology.

 
Minimum Viable Product (MVP)

The MVP is one of the final stages before a product release. It extends the technology prototype with further data engineering to reach a fully automated, robust solution whose ML model performs well enough to be deployed into a production environment and deliver business impact.

Here, the client gets the full benefits of having an ML model running.

Machine learning components of the development standards

Progressing through the development standards is an iterative process. We go through our ML pipeline for each of them, with increasing degrees of time and effort, as the level of complexity increases.

This table summarises the elements of our ML pipeline that we usually apply for each development standard. Note that we use this table as a guideline. Not all of these elements will suit every problem, and some problems require specific steps not described here, so we adapt to the situation.

Table 1: Machine Learning components of the development standards

If you want to find out more about our machine learning development standards, whether to apply them to your own projects or practice or to get started with us, download the standards here.
