Imperial's community of skilled data science pioneers can provide solutions using a huge range of expertise:
Machine learning - Computational finance, pattern recognition, probabilistic inference and support vector machines.
Statistics - Bayesian data analysis, medical statistics, methodology assessment, probability and statistical inference.
Data analytics, modelling and evaluation - Data cleaning, data mining, data quality and data visualisation, integer programmes, Markov processes, analytic reasoning and semantic extraction techniques.
Big Data - Bioinformatics, data fusion, genetic algorithms, hardware and software infrastructure and time series analysis.
Scalable and efficient analysis of data
Using novel methods and algorithms, our experts can conduct Big Data analysis on compute clusters or high-performance computing infrastructure (i.e. supercomputers). They can also implement new analytical approaches and algorithms for commercial applications, such as spatial analytics for Transport for London or medical analytics for start-ups, as well as scientific applications in medicine, neuroscience, physics and more.
- Thomas Heinis - expertise in Big Data, distributed processing, scientific data management, spatial data, spatial indexing, data analytics, high performance data analytics, data management on novel hardware.
- Dr Jonathan Passerat-Palmbach - expertise in Big Data (Spark), distributed processing, scientific data management, high performance data analytics, data management, numerical reproducibility, software engineering (Scala and functional programming), devops (Docker).
Big Data and Statistical Machine Learning
Our experts are developing machine learning techniques that can handle modern data types, such as free text, and draw on statistical and computational intelligence techniques to navigate vast amounts of information, like distributed databases or data streams, with minimal human supervision.
They have also developed Bayesian anomaly detection methods to protect high-volume data streams and large dynamic computer networks against cyber-attacks and fraudulent activity. See Statistical Cyber Security Analytics.
- Prof Niall Adams - expertise in classification, data mining, streaming data analysis and spatial statistics for bioinformatics.
- Dr Nick Heard - expertise in computational Bayesian inference, cluster analysis, graph analysis and topic modelling for large dynamic networks such as computer networks or social networks, and in bioinformatics problems.
- Dr Marina Evangelou - expertise in Bayesian statistics, machine learning and network analysis.
- Dr Roberto Trotta - develops advanced statistical and numerical methods for the analysis and interpretation of complex data from astrophysics, cosmology and particle physics which underpin statistical consultancy and custom-made data analysis solutions. He also works as a scientific consultant with museums, writers, film makers and artists, providing the help and support they need to make their artistic creations scientifically sound.
Privacy and use of data
Large-scale datasets (such as mobile phone logs, credit card usage, browsing metadata, membership or customer sales information) offer huge insight into the location, habits and requirements of people without the need for questionnaires. However, anonymity and privacy concerns require organisations to gather and store this data securely, which restricts the use of data that could plot trends, identify needs and help us better understand societies on a large scale.
Our experts can help you secure your data and apply machine learning techniques to gain invaluable insight, including data-driven customer segmentation for marketeers. Some examples...
- using behavioural indicators they can predict people's personality up to 1.7x better than random to help organisations better understand their customers.
- using just 4 spatio-temporal points they can uniquely identify 95% of people in a mobile phone database of 1.5M and 90% of people in a credit card database of 1M.
- using telco data they can provide insight into the spread of infectious diseases and inform strategies to micro-target outreach and drive health-seeking behaviour.
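To illustrate the re-identification example above, here is a minimal Python sketch of how one might estimate the share of people who are uniquely pinned down by a handful of known points. It is an illustration under stated assumptions, not the researchers' actual method: the function name `unicity`, the trace format (a set of `(antenna_id, hour)` pairs per user) and the toy data are all hypothetical.

```python
import random

def unicity(traces, p=4, trials=200, seed=0):
    """Estimate the fraction of users uniquely identified by p known points.

    traces: dict mapping a user id to a set of (antenna_id, hour) pairs.
    This data layout is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    # Only users with at least p recorded points can be sampled.
    users = [u for u, pts in traces.items() if len(pts) >= p]
    if not users:
        return 0.0
    unique = 0
    for _ in range(trials):
        user = rng.choice(users)
        # Pretend an observer knows p random points from this user's trace.
        known = set(rng.sample(sorted(traces[user]), p))
        # Count every trace in the database that contains all known points.
        matches = sum(1 for pts in traces.values() if known <= pts)
        if matches == 1:
            unique += 1
    return unique / trials

# Toy, made-up data: users "a" and "b" share three of their four points.
toy_traces = {
    "a": {(1, 8), (2, 9), (3, 12), (1, 18)},
    "b": {(1, 8), (2, 9), (4, 13), (1, 18)},
    "c": {(5, 7), (6, 9), (7, 12), (5, 19)},
}

all_four = unicity(toy_traces, p=4)   # every user is distinguishable
only_two = unicity(toy_traces, p=2)   # "a" and "b" are sometimes confused
```

In this toy data, knowing all four points always singles out one person, while knowing only two points sometimes leaves "a" and "b" indistinguishable.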
What's more, data from separate datasets can be matched to provide a broader and more informed picture of an individual. For example, by matching data from different departments or registers within an organisation, they can create an overarching profile of a customer so their needs can be quickly and effectively assessed.
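As a rough sketch of the matching idea, the Python below merges records from two hypothetical departmental datasets into one profile per customer. It assumes the datasets already share a reliable identifier (here a made-up `customer_id` field); in practice, matching may need fuzzier linkage on names, dates or addresses. The function name `build_profiles` and the sample records are illustrative only.

```python
from collections import defaultdict

def build_profiles(*datasets, key="customer_id"):
    """Merge records that share the same key into one profile per customer."""
    profiles = defaultdict(dict)
    for records in datasets:
        for record in records:
            customer = record[key]
            for field, value in record.items():
                if field != key:
                    # Keep the first value seen; later sources only fill gaps.
                    profiles[customer].setdefault(field, value)
    return dict(profiles)

# Hypothetical records held by two departments of the same organisation.
billing = [{"customer_id": 17, "plan": "monthly"}]
support = [
    {"customer_id": 17, "open_tickets": 2},
    {"customer_id": 42, "open_tickets": 0},
]

profiles = build_profiles(billing, support)
```

Keeping the first value seen is one simple conflict rule among several; an organisation might instead prefer the most recent record or the most trusted source.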
- Yves-Alexandre de Montjoye - expertise in mobility traces, computational privacy, metadata, causal inference, Big Data, behavioural modelling, computational social sciences.