Statistics for big data
Wednesday 17th September 2014
Thank you to those who attended. Slides for the first three talks are available to view again at the foot of this page (visible to members only).
The IBS-BIR will hold a meeting on the theme of "Big Data" from 1.30pm on September 17 2014 in London. Registration has now closed, so please email doug.speed
The aim is to bring together researchers with applied and/or theoretical work in large-scale dataset analysis. The meeting will try to highlight links and shared methodological challenges across different research areas. The schedule is:
1.30pm; Jaakko Peltonen, Aalto University: Lost in Publications? How to Find Your Way in 50 Million Scientific Documents
2.15pm; Tom Thorne, University of Edinburgh: Bayesian nonparametrics and biological networks
3.00-3.30pm; Coffee and Tea
3.30pm; Finn Lindgren, University of Bath: Big models for structured environmental big data
4.15pm; Yoram Bachrach, Microsoft Research Cambridge: The Human Manifold: On the Predictability of Human Online Behaviour, its Consequences, and what Facebook data mining can tell us about it
See below for abstracts and bios for each speaker.
Registration costs (which include refreshments)
£25 for IBS members
£10 for IBS student or retired members (student membership of IBS is free)
£40 for non-IBS members
For membership information, visit http://bir.biometricsociety.org/membership
To register, click on the link at the top of this page; payment can be made via PayPal, debit card or cheque.
The four talks will be held in the Roberts Building Lecture Theatre, Room 106, on the UCL Main (Gower Street / Bloomsbury) Campus.
The nearest underground stations are Euston Square, Warren Street and Russell Square (all within 5 minutes' walk), while the closest overground stations are Euston (5 minutes' walk) and King's Cross (15 minutes). For directions to the Main Campus see here http://www.ucl.ac.uk/maps
The lecture theatre is in the Roberts Engineering building, which is immediately on the left as you enter the South Entrance to campus at Malet Place, off Torrington Place, indicated by the red spot on this map http://crf.casa.ucl.ac.uk/screenRoute.aspx?s=386&d=129&w=False
Jaakko Peltonen, Aalto University: Lost in Publications? How to Find Your Way in 50 Million Scientific Documents
Researchers must navigate big data. Current scientific knowledge includes 50 million published articles. How can a system help a researcher find relevant documents in their field? We introduce IntentRadar, an interactive search user interface and search engine that anticipates the user’s search intents by estimating them from the user’s interaction with the interface. The estimated intents are visualized on a radial layout that organizes potential intents as directions in the information space. The intent radar helps users direct their search by allowing feedback to be targeted on keywords that represent the potential intents. Users can provide feedback by manipulating the position of the keywords on the radar. The system then learns and visualizes improved estimates and corresponding documents. IntentRadar has been shown to significantly improve users’ task performance and the quality of retrieved information without compromising task execution time.
Tom Thorne, University of Edinburgh: Bayesian nonparametrics and biological networks
There are a number of ways that we can build more flexible models of biological networks by applying methods from the field of Bayesian nonparametrics. These methods allow us to include covariates such as time, and to infer models whose complexity scales naturally with the observed data. I will discuss applications of Bayesian nonparametric methods to existing biological data, and describe the implementation and computational aspects of the inference procedure.
Finn Lindgren, University of Bath: Big models for structured environmental big data
Large quantities of data do not imply a large information content. Remote sensing data provide higher global and regional coverage for current environmental observations, which helps in estimating the spatial dependence structure, but for long term trends the data are spatially sparse and opportunistically sampled. Hierarchical models based on stochastic processes can be used to handle such situations, incorporating problem-specific knowledge in order to avoid bias due to non-designed data collection. Advanced numerical methods and models are essential for computationally efficient construction of appropriate measures of uncertainty for estimation, reconstruction, and prediction.
Yoram Bachrach, Microsoft Research Cambridge: The Human Manifold: On the Predictability of Human Online Behaviour, its Consequences, and what Facebook data mining can tell us about it
Online social networks have had an enormous leap in popularity in recent years. We show how one can use relatively simple machine learning techniques to analyze information from social network profiles and make surprisingly accurate predictions regarding properties of the profile owner, including gender, age, personality, intelligence, happiness and even sexual orientation, religious or political views and use of addictive substances. Finally, we discuss some of the implications of these results.
Jaakko Peltonen is an adjunct professor at Aalto University in the Department of Information and Computer Science, and also an associate professor of statistics at the University of Tampere. Jaakko's group researches statistical machine learning and bioinformatics, and his interests include probabilistic generative and information-theoretic methods and formalisms such as information retrieval based dimensionality reduction, especially for application in visualization, clustering, and bioinformatics. In the past, Jaakko has also programmed computer games and composed soundtrack music; he is best known for Falcon's Eye, a graphical interface for the computer role-playing game NetHack.
Finn Lindgren is a reader in Statistics at the Department of Mathematical Sciences, University of Bath. His research topics include computational statistical inference, spatial modelling, Bayesian hierarchical modelling, Gaussian Markov random fields and stochastic partial differential equations, while he is also interested in applications in climate modelling, ecology, general environmetrics and geostatistics.
Thomas Thorne is currently a Chancellor’s Fellow in the School of Informatics at the University of Edinburgh. His research interests focus on applications of statistical methodologies to problems in Systems Biology and Bioinformatics, including graphical modelling of gene regulatory networks, statistical analysis of protein-protein interaction networks, building flexible biological models using Bayesian nonparametrics, and GPGPU methods for large-scale inference.
Yoram Bachrach is currently a researcher at Microsoft Research Cambridge and is part of the Online Services and Advertising and Applied Games group. His work focuses on the application of large-scale machine learning and probabilistic modelling techniques to a wide range of problems including online advertising, web search, and games.
Document downloads for IBS members:
Talk 1: Jaakko Peltonen, Lost in Publications?
Talk 2: Tom Thorne, Bayesian nonparametrics and biological networks (Part 1 of 2)
Talk 2: Tom Thorne, Bayesian nonparametrics and biological networks (Part 2 of 2)
Talk 3: Finn Lindgren, Big models for structured environmental big data (Part 1 of 2)
Talk 3: Finn Lindgren, Big models for structured environmental big data (Part 2 of 2)