Data engineering means building the systems that bring a company's data to its users.
In practice, this means replicating all of your company's data into a single system, so the business can access it easily, reliably, safely and systematically.
Over the past two decades, we have built such systems for clients and employers ranging from startups to multi-billion dollar conglomerates and a global top 100 website.
We offer two types of engagement:
- rapid: we replicate your two most important sources of data onto our standard setup, and hand you the keys. This can be done in weeks.
- thorough: we work with you to replicate all your data sources in the tech stack of your choice.
We can help you understand complex situations using statistical learning (a field variously known as, or overlapping with, artificial intelligence, machine learning and data science).
This is broadly split into unsupervised learning, that is, finding patterns in the data without knowing in advance what we are looking for; and supervised learning, where we attempt to predict a known objective such as the value of a property or whether an email is spam.
The output of both types of learning can further be split into prediction and intuition. Prediction is about estimating an unknown value: what will this house sell for? Intuition is about understanding a complex situation: what drives house prices?
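The house-price example above can be made concrete with a toy supervised model. This is a sketch on invented numbers, fitting price against floor area by ordinary least squares: the fitted line gives a prediction for a new house, and the slope gives intuition (how much each extra square metre adds to the price).

```python
def fit_line(sizes, prices):
    """Fit price = a + b * size by ordinary least squares (one variable)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(prices) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
         / sum((x - mean_x) ** 2 for x in sizes))
    a = mean_y - b * mean_x
    return a, b

sizes = [50, 75, 100, 125]           # floor area in square metres (made up)
prices = [150_000, 225_000, 300_000, 375_000]
a, b = fit_line(sizes, prices)
print(round(a + b * 90))             # prediction: estimated price of a 90 m^2 house
print(round(b))                      # intuition: price added per extra square metre
```

Real models use many variables and more robust fitting, but the two kinds of output are the same: a number you can act on, and a coefficient you can reason about.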
When it comes to securing your data, ask yourself:
- Do you have a list of all data sources?
- Do you know who has access to each?
- Do you log all activity on these sources?
- Are the systems the data lives on and goes through secure?
- Do you encrypt all your data sources, and traffic to/from them?
- How often do you back them up?
- Do you have row- and column-level security?
- How do you manage your keys?
- How about your client certificates?
- Do you seed your data to track leaks?
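Seeding, in the last question, refers to planting unique fake records (sometimes called honeytokens) in each copy of a dataset you share: if a seed ever turns up elsewhere, it identifies which copy leaked. A minimal sketch, with made-up field and partner names:

```python
import secrets

def make_seed(partner: str) -> dict:
    """Create a unique fake record tied to one recipient of the data."""
    token = secrets.token_hex(8)
    return {"name": f"seed-{token}",
            "email": f"{token}@example.com",
            "partner": partner}

def find_leak(leaked_rows, seeds):
    """Return the partner whose seed appears in leaked data, or None."""
    emails = {s["email"]: s["partner"] for s in seeds}
    for row in leaked_rows:
        if row.get("email") in emails:
            return emails[row["email"]]
    return None
```

In use, you would mix one seed into each partner's export; when a leaked file surfaces, scanning it with `find_leak` points at the source of the leak.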