LMU Munich, Department of Statistics, Working Group Methodological Foundations of Statistics and their Applications

Fifth Workshop on Principles and Methods of Statistical Inference with Interval Probability

Munich, 10 - 15 September 2012

programme


All workdays will start at 09:00 a.m.
day | subject | organized by
Mon 10th | Classification / Imprecise Probability as a new perspective on basic statistical tasks (detailed programme) | Lev Utkin, Gero Walter / Frank Coolen, Thomas Augustin
Tue 11th | Regression and support vector machines (detailed programme) | Lev Utkin, Gero Walter
Wed 12th | Evaluation and comparison of imprecise methods and models (detailed programme) | Alessandro Antonucci, Andrea Wiencierz
Thu 13th | Learning and updating (detailed programme) | Sébastien Destercke, Georg Schollmeyer
Fri 14th | Open topics (detailed programme) | Tahani Coolen-Maturi, Marco Cattaneo
Sat 15th | Excursion | Andrea Wiencierz
  • Those arriving on Sunday are welcome to join us for an informal get together at 19:00 (7 pm)
    at "Alter Simpl" (Türkenstraße 57, 80799 München)
  • Overview of the workshop's schedule, including suggestions for dinner locations.
  • The list of restaurants gives the addresses of the suggested dinner locations as well as further places for lunch and dinner.

list of participants


detailed programme


All workdays will start at 09:00 a.m.

Mon 10th

Classification
Imprecise Probability as a new perspective on basic statistical tasks


Paul Fink:

Influencing the predictive ability of (bags of) imprecise trees by restrictions and aggregation rules


In a first simulation, the impact of restrictions on the tree-growing algorithm [Abellán and Moral (2003)] is studied, varying the crucial imprecise Dirichlet parameter s and a stopping rule induced by a minimal leaf size. The second simulation deals with different aggregation rules for combining a bag of imprecise trees. Rules operating on the predicted classes as well as rules operating on the predicted class probability intervals are considered and compared. Moreover, in both settings a bag and a single imprecise tree are grown and compared side by side.
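The class probability intervals behind such imprecise trees come from the imprecise Dirichlet model (IDM). A minimal sketch of how the parameter s widens the leaf intervals (function name and example counts are illustrative, not taken from the talk):

```python
# Imprecise Dirichlet model (IDM): class probability intervals in a tree leaf.
# For class counts n_k with N = sum of all counts and hyperparameter s, the IDM
# yields   lower_k = n_k / (N + s),   upper_k = (n_k + s) / (N + s).
# Larger s widens every interval, which is why s is crucial for tree growing.

def idm_intervals(counts, s=1.0):
    """Return {class: (lower, upper)} probability intervals under the IDM."""
    N = sum(counts.values())
    return {k: (n / (N + s), (n + s) / (N + s)) for k, n in counts.items()}

leaf = {"A": 8, "B": 2}          # hypothetical class counts in a leaf
print(idm_intervals(leaf, s=2))  # A: ≈(0.667, 0.833), B: ≈(0.167, 0.333)
```

With s = 0 the intervals collapse to the precise relative frequencies, so the choice of s directly trades precision against caution.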
Richard Crossman:

Ensembles of Classification Trees with Interval Entropy


I want to discuss a generalised classification tree method based on the Abellán/Moral/Masegosa method, and to talk about adapting elements of Potapov's TWIX ensemble method to it, so that we can handle continuous variables properly rather than just discretising them.
Sébastien Destercke:

Binary decomposition with imprecise probabilities


Decomposing a problem into binary classifiers is a seductive way to transform a complex problem into a set of easier ones. It is used in multiclass problems as well as in other classification problems such as label ranking or multilabel prediction. In this talk, we review the latest results about using binary classifiers with imprecise probabilities, point out the possible remaining problems, and offer some perspective on the use of such classifiers.
Lev Utkin:

An imprecise boosting-like approach to classification


It is shown that one reason why the AdaBoost algorithm overfits in classification is its extreme imprecision: the probabilities or weights used for adaptation can vary over the whole unit simplex. A way of reducing this imprecision by exploiting well-known imprecise statistical models is proposed, called the imprecise AdaBoost. It is shown that this reduction provides an efficient way of dealing with highly imbalanced training data. Moreover, it is shown that the reduced sets of probabilities can be changed at each iteration of AdaBoost, for example by using the imprecise Dirichlet model. Various numerical experiments with well-known datasets illustrate the peculiarities and advantages of the imprecise AdaBoost.
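One simple way to see how an imprecise statistical model restricts the weight set is the linear-vacuous (ε-contamination) construction. The sketch below illustrates that idea only; it is not Utkin's actual algorithm, and the numbers are made up:

```python
# Illustration (not Utkin's algorithm): instead of letting boosting weights
# range over the whole unit simplex, restrict them to the eps-contaminated
# neighbourhood of the uniform distribution,
#   { (1-eps) * uniform + eps * q : q in the unit simplex },
# so that no single observation's weight can become arbitrarily extreme.

def restrict_weights(w, eps=0.5):
    """Shrink a weight vector towards uniform (linear-vacuous contamination)."""
    n = len(w)
    total = sum(w)
    return [(1 - eps) / n + eps * wi / total for wi in w]

w = [0.97, 0.01, 0.01, 0.01]         # extreme AdaBoost-style weights
print(restrict_weights(w, eps=0.5))  # each weight now stays in [0.125, 0.625]
```

With eps = 1 the full simplex (maximal imprecision) is recovered; smaller eps means stronger regularisation of the weight updates.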
Tue 11th

Regression and support vector machines

Georg Schollmeyer:

Linear models and partial identification: Imprecise linear regression with interval data


In several areas of research, such as economics, the engineering sciences, or geodesy, handling interval-valued observations to reflect some kind of non-stochastic uncertainty is receiving increasing attention. In the special case of a linear model with interval-valued dependent variables and precise independent variables, one can use the linear structure of the least-squares estimator to develop an appropriate, now set-valued estimator, which has been developed seemingly independently in several papers (Beresteanu and Molinari, 2008; Schön and Kutterer, 2005; Černý, Antoch, and Hladík, 2011). The geometric structure of the resulting estimate is that of a zonotope, an object widely studied in computational geometry. In this talk I introduce the above-mentioned estimators and some of their properties, as well as two different ways to construct confidence regions for them: one is to treat them as set-valued point estimators and utilize random set theory; the other is to see them as collections of point estimators, for which one has to find appropriate collections of confidence ellipsoids.
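The zonotope structure mentioned above follows because the least-squares estimator is linear in the response vector. A minimal numpy sketch under assumed toy data (all numbers are invented for illustration):

```python
import numpy as np
from itertools import product

# Interval-valued responses [y_lo, y_hi], precise design matrix X.
# Since beta = (X'X)^{-1} X' y is linear in y, the set of least-squares
# estimates over the box of admissible response vectors is a zonotope;
# its vertices are among the images of the corners of the box.

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # intercept + one covariate
y_lo = np.array([0.9, 1.8, 3.1])
y_hi = np.array([1.1, 2.2, 3.3])

H = np.linalg.solve(X.T @ X, X.T)    # linear LS map: beta = H @ y
betas = np.array([H @ np.array(c) for c in product(*zip(y_lo, y_hi))])
print(betas.min(axis=0), betas.max(axis=0))  # componentwise bounds on the zonotope
```

Componentwise bounds are only an outer box; the zonotope itself is the convex hull of the corner images, which in general is strictly smaller.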
Chel Hee Lee:

Imprecise Probability Estimates for GLIM


We study imprecise priors for the generalized linear model to build a framework for Walley's (1991) inferential paradigm that also incorporates an effect of explanatory variables for quantifying epistemic uncertainty. For ease of exposition, we restrict ourselves to Poisson sampling models giving an exponential family using the canonical log-link function. Normal priors on the canonical parameter of the Poisson sampling models lead to a three-parameter exponential family of posteriors which includes the normal and log-gamma as limiting cases. The canonical parameters simplify dealing with families of priors, as Bayesian updating corresponds to a translation of the family in the canonical hyperparameter space. The canonical link function creates a linear relationship between regression coefficients of explanatory variables and the canonical parameters of the sampling distribution. Thus, normal priors on the regression coefficients induce normal priors on the canonical parameters, leading to a multi-parameter exponential family of posteriors whose limiting cases are again normal or log-gamma. As an implementation of the model, we present a prototype of work in progress for the project hosted at r-forge.r-project.org, titled `Imprecise Probability Estimates in GLM'.
Lev Utkin:

Imprecise statistical models and the robust SVM


A framework for constructing robust one-class classification models is proposed in the paper. It is based on Walley's imprecise extensions of contaminated models, which produce a set of probability distributions of data points instead of a single empirical distribution. The minimax and minimin strategies are used to choose an optimal probability distribution from the set and to construct optimal separating functions. It is shown that an algorithm for computing optimal parameters is determined by the extreme points of the probability set and reduces to a finite number of standard SVM tasks with weighted data points. Important special cases of the models, including the pari-mutuel model, the constant odds-ratio model, contaminated models, and Kolmogorov-Smirnov bounds, are studied. Experimental results with synthetic and real data illustrate the proposed models.
Andrea Wiencierz:

Linear Likelihood-based Imprecise Regression (LIR) with interval data

Wed 12th

Evaluation and comparison of imprecise methods and models

Andrea Wiencierz and Alessandro Antonucci:

Evaluation and comparison of imprecise methods and models — A short introduction

Alessandro Antonucci:

Evaluating imprecise classifiers: from discounted accuracy to utility-based measures


Imprecise classifiers can possibly assign more than a single class label to a test instance of the attributes. Accuracy can therefore characterize the performance only on instances labeled by single classes. The problem of evaluating an imprecise classifier on the whole dataset is discussed with a focus on a recently proposed utility-based approach. This produces a single measure which can be used to compare an imprecise classifier with others, either precise or imprecise.
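The utility-based measures referenced here can be sketched concretely. Assuming the quadratic u65/u80 corrections of discounted accuracy (as in the utility-based approach of Zaffalon and co-authors; the example labels are invented):

```python
# Discounted accuracy: a set-valued prediction containing the true class among
# k candidates scores d = 1/k (0 if the true class is missing). The u65 and u80
# measures are the quadratic utilities with u(0) = 0, u(1) = 1 and u(1/2) = 0.65
# resp. 0.80, rewarding cautious partial answers over 50/50 random guessing.

def discounted_accuracy(predicted_set, true_label):
    return 1.0 / len(predicted_set) if true_label in predicted_set else 0.0

def u65(d):
    return -0.6 * d**2 + 1.6 * d

def u80(d):
    return -1.2 * d**2 + 2.2 * d

d = discounted_accuracy({"A", "B"}, "A")
print(d, u65(d), u80(d))   # d = 0.5, u65 ≈ 0.65, u80 ≈ 0.8
```

Averaging u65 (or u80) over a test set gives the single number that allows comparing an imprecise classifier with precise ones, for which the measure reduces to ordinary accuracy.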
Sébastien Destercke:

Comparing credal classifiers: ideas from Label Ranking


In this talk, we recall the basic scheme of the label ranking problem. We then present some solutions recently proposed for label ranking methods to measure the efficiency of classifiers returning a partial or incomplete answer. The application of such measures to credal classifiers is then sketched briefly.
Andrea Wiencierz:

Evaluating imprecise regression

Marco Cattaneo:

Graphical comparison of imprecise methods

Georg Schollmeyer:

Evaluation and comparison of set-valued estimators: empirical and structural aspects


In this talk we investigate the problem of evaluating set-valued estimators. We look at estimators as 'approximations of the truth', contrasting the goodness of these approximations in an empirical and in a structural manner, respectively. We exemplify this along the lines of location estimators and the problem of linear regression. Here it is also useful to look at set-domained, set-valued estimators.
Finally, we try to motivate the need to satisfy structural properties, at least in a practical sense, and state a little lemma about the extension of undominated point-domained estimators to undominated set-domained estimators, which indicates the usefulness of set-monotonicity.
Thu 13th

Learning and updating

Sébastien Destercke:

Label ranking: interest for IP and problems (a short introduction)


In this talk, we present the label ranking problem and explain why imprecise probabilities may be useful to deal with such a problem. We also present some interesting challenges concerning decision and statistical models used in such problems.
Gero Walter:

Boat or bullet: prior parameter set shapes and posterior imprecision


In generalized Bayesian inference based on sets of conjugate priors, the prior credal set is taken as the convex hull of conjugate priors whose parameters vary in a set. Even if equivalent in terms of prior imprecision, different parameter set shapes may lead to different updating behaviour, and thus influence posterior imprecision significantly. Using a canonical parametrization of priors, Walter & Augustin have proposed a simple set shape that leads to additional posterior imprecision in case of prior-data conflict. With the help of a different parametrization proposed by Mik Bickis, Walter, Coolen & Bickis now have found a set shape that, in addition to prior-data conflict sensitivity, also reduces imprecision particularly when prior and data are in strong agreement. In Bickis' parametrization, the set shape resembles a boat with a transom stern, or a bullet.
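The rectangular set shape proposed by Walter & Augustin can be made concrete in the Beta-Binomial case. A hedged sketch with invented numbers (the boat and bullet shapes discussed in the talk are refinements of this rectangle):

```python
from itertools import product

# Rectangular prior parameter set for the Beta-Binomial model: priors
# Beta(n0*y0, n0*(1-y0)) with (n0, y0) in [n_lo, n_hi] x [y_lo, y_hi].
# Observing s successes in N trials updates (n0, y0) to
#   ( n0 + N,  (n0*y0 + s) / (n0 + N) ).
# The posterior mean is monotone in each parameter, so its range over the
# rectangle is attained at the four corners.

def posterior_mean_range(n_lo, n_hi, y_lo, y_hi, s, N):
    means = [(n0 * y0 + s) / (n0 + N)
             for n0, y0 in product((n_lo, n_hi), (y_lo, y_hi))]
    return min(means), max(means)

# Prior: success probability in [0.7, 0.8], prior weight n0 in [1, 5].
print(posterior_mean_range(1, 5, 0.7, 0.8, s=12, N=16))  # data agree: narrow range
print(posterior_mean_range(1, 5, 0.7, 0.8, s=2, N=16))   # prior-data conflict: wider
```

The widening of the posterior range under prior-data conflict is exactly the sensitivity property that the more refined set shapes preserve while additionally shrinking imprecision when prior and data strongly agree.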
Marco Cattaneo:

On the estimation of conditional probabilities

Roland Poellinger:

Superimposing Imprecise Evidence onto Stable Causal Knowledge: Analyzing 'Prediction' in the Newcomb Case


Referring back to the physicist William Newcomb, Robert Nozick (1969) elaborates on what he calls Newcomb's problem, a decision-theoretic dilemma in which two principles of rational choice seemingly conflict with each other, at least in numerous renditions in the vast literature on this topic: dominance and the principle of maximum expected utility recommend different strategies in the plot of the game situation. While evidential decision theory (EDT) seems to be split over which principle to apply and how to interpret the principles in the first place, causal decision theory (CDT) seems to go for the solution recommended by dominance ("two-boxing").

In this talk I will prepare the ground for an understanding of causality that enables the causal decision theorist to answer Nozick's challenge with the solution of one-boxing, by drawing on the framework of causal knowledge patterns, i.e., Bayes net causal models built upon stable causal relations (cf. Pearl 1995 and 2000/2009) augmented by non-causal knowledge (epistemic contours). This rendition allows a careful re-examination of all relevant notions in the original story and facilitates approaching the following questions:

1. How may causality in general be understood to allow causal inference from hybrid patterns encoding subjective knowledge?

2. How can the notion of prediction be analyzed - philosophically and formally?

3. If all relations given in the model represent stable causal knowledge, how can imprecise evidence be embedded formally? Or in other words: How can the unreliable predictor be modeled without discarding the core structure?

4. Finally, in what way could unreliable prediction be modeled with interval probability, as motivated by considerations in Nozick's treatise? And what should be the interpretation of such a rendition?


References:

1. Nozick, R.: Newcomb's Problem and Two Principles of Choice. In: Rescher, N. (ed.): Essays in Honor of Carl G. Hempel. Dordrecht: Reidel, 1969, 114-146.
2. Pearl, J.: Causality: Models, Reasoning, and Inference. Cambridge University Press, 2009.
3. Pearl, J.: Causal diagrams for empirical research. Biometrika 82 (1995), 669-688.
Fri 14th

Open topics

Atiye Sarabi Jamab:

A Comparison of Approximation Algorithms in Dempster Shafer Theory based on New Basis Dissimilarity Measures


Computational complexity of combining various independent pieces of evidence in Dempster-Shafer theory (DST) motivates the development of approximation algorithms for simplification. Some approaches consider special types of evidence, for example working on the quality of the belief functions to be combined. Another category of approaches is based on Monte-Carlo techniques, where the idea is to estimate the exact values of belief and plausibility by comparing the different outcomes relative to randomly generated samples. The last category tries to reduce the growing number of focal sets during the combination process by simplification.

Many approaches have been introduced to improve the efficiency of computational methods, and many analytical and numerical studies propose different distance measures and benchmarks to investigate and compare the approximation methods. While many distance measures can be found in the literature, experiments show that the information content of these distance measures is highly overlapping. In this talk, first, through a thorough analysis of dissimilarity measures, a set of more informative and less overlapping measures will be introduced as basis dissimilarity measures. This basis will then be used to investigate and compare the quality of approximation algorithms in Dempster-Shafer theory. To this end, three benchmarks along with the classic combination benchmark will be proposed; existing approximation methods will be compared on them and their overall qualitative performance summarized.
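As a concrete instance of the focal-set-reduction category, here is a sketch of a simple summarization-style approximation: keep the k focal sets with the largest mass and move the remaining mass to the frame of discernment. This is a hypothetical minimal variant for illustration, not one of the specific algorithms compared in the talk:

```python
# A basic mass function is a dict mapping focal sets (frozensets) to masses
# summing to 1. Summarization keeps the k heaviest focal sets and assigns the
# leftover mass to the frame of discernment, which keeps the approximation a
# valid (if less informative) mass function.

def summarize(masses, k):
    """Keep the k largest-mass focal sets; dump the rest onto the frame."""
    top = sorted(masses, key=masses.get, reverse=True)[:k]
    rest = sum(m for f, m in masses.items() if f not in top)
    frame = frozenset().union(*masses)          # frame of discernment
    approx = {f: masses[f] for f in top}
    approx[frame] = approx.get(frame, 0.0) + rest
    return approx

m = {frozenset("a"): 0.5, frozenset("b"): 0.3,
     frozenset("ab"): 0.15, frozenset("c"): 0.05}
print(summarize(m, 2))   # mass 0.2 moved to the frame {a, b, c}
```

Because mass is moved to a larger set rather than discarded, the approximation only loses specificity, never total mass; this is the kind of trade-off the proposed dissimilarity benchmarks are meant to quantify.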
Robert Schlicht:

Dual Representation of Convex Sets of Probability Distributions


Sets of probability distributions appear in various contexts in both statistics (e.g. as parametric models) and probability theory (e.g. probability distributions determined by marginal constraints). They also present a form of interval probabilities with a particularly nice (linear) mathematical structure. Specifically, closed convex sets of probability distributions have equivalent representations as closed convex cones in a function space and, moreover, as preference relations between gain (or loss) functions.
In the first part of the talk, the mathematical background, essentially amounting to classical results from integration theory and functional analysis, is presented in a general form. Next, the relationship to imprecise probabilities and statistical decision theory is discussed. The last part of the talk explores several applications, including the Kolmogorov extension theorem, stochastic orders, transportation problems, and conditioning.
Marco Cattaneo:

Likelihood-based imprecise probabilities and decision making


Likelihood inference offers a general approach to the learning of imprecise probability models. In this talk, we consider properties of the statistical decisions resulting from these imprecise probability models, in connection with decisions based directly on the likelihood function.
Damjan Škulj:

Calculations of the solutions of interval matrix differential equations using bisection


Computing the lower and upper bounds for the solution of an interval matrix differential equation is generally computationally very expensive. Such equations typically appear in modelling continuous-time imprecise Markov chains.
I will propose a method that, in some important cases, reduces this computational complexity.
Andrey Bronevich:

Measuring uncertainty and information of sign-based image representations



short summary of programme points

day subject organized by
Mon 10th Classification /
Imprecise Probability as a new perspective on basic statistical tasks

During the last two decades, research in imprecise probability (IP) has seen a big increase, including substantial attention to statistical inference. The current state-of-the-art consists mostly of methods which, effectively, use sets of models simultaneously. While this has led to sophisticated methods with interesting properties, it may not have too much to offer to applied statisticians, who may consider the IP-based methods even more complex to understand and to use than the more established approaches.
It would be great if IP could in addition lead to statistical methods that are better and easier to apply, for non-experts, than the established methods, or at least either of these without being (substantially) worse on the other. During this workshop day, we look for ideas and discussions about such possible IP approaches, with specific focus on basic statistical problems.
One can think, e.g., of the t-test or the chi-square test, which are very often applied (even by non-experts) but whose underlying assumptions are often not taken into account carefully. Could IP provide statistical methods that can be applied, and ideally understood, by non-experts and that are "better"?
A further possible area where IP-based statistical methods can have substantial impact is for more complex models, where allowing imprecision might lead to simplifications that could be attractive in real-world applications. For example, one could consider IP models in which some aspects of more detailed models are left unspecified, and the resulting inference may already be sufficient or could indicate that more detailed specification is required.
It will be crucial for the wider success of IP-based statistical methods that they be attractive to applied statisticians and non-expert users; we aim at some focus on this in presentations and discussions.
If you want to make a scheduled contribution to this day, please email us by Friday 31 August with a clear indication of your intended contribution, that is which challenge(s) you wish to address and how, what issues your presentation will raise for discussion, and how much time you would ideally have, indicated as "minutes for presentation + minutes for discussion". Also, if participants can prepare for your presentation and discussion, it would be useful to give an indication how (e.g. some references or an explicit question).
Lev Utkin, Gero Walter /
Frank Coolen, Thomas Augustin
Tue 11th Regression and support vector machines

Lev Utkin, Gero Walter
Wed 12th Evaluation and comparison of imprecise methods and models

On this day of the workshop we will discuss how to evaluate and compare statistical methods, in general, and, more specifically, imprecise methods and models for classification and regression.
When based on imprecise methods, classification algorithms can possibly assign to an instance a set of class labels instead of a single one. Measuring the quality of such (partially) indeterminate results by single numbers is important to fairly compare different models. An open discussion about the results and the challenges in this field will be presented.
Furthermore, several imprecise methods for regression have been proposed in the recent years. These methods generalize precise regression in different ways. To evaluate and compare imprecise regression methods it is important to characterize them by their statistical properties like, e.g., coverage probability, consistency, and robustness. However, in some cases the notions are too narrow to be directly applied to the generalized method. We will discuss possible generalizations of these notions and explore the statistical properties of selected imprecise regression methods.
Anyone interested in taking part in the discussion is very welcome. If you want to make a contribution to this day, please let us know by Friday 31 August 2012.
Alessandro Antonucci, Andrea Wiencierz
Thu 13th Learning and updating

The problem of learning or estimating a statistical or a probabilistic model is an old one. The estimated model can then be used for various purposes (uncertainty propagation, regression and classification, ...).
This day of the workshop is devoted to the problem of learning within imprecise probability (IP) frameworks. Two particular questions that arise in this situation are
(1) what is the interest of using IP in my problem? and
(2) Is there an approach to handle the learning problem efficiently?
These questions become even more critical when problems get complex, i.e., when learning multidimensional models, models with a high number of parameters (e.g., mixture models), or models of structured data (rankings, networks, ...), or when dealing with situations where uncertainty or inconsistencies are particularly severe (e.g., rare events, conflict between data and expert knowledge, uncertain data).
We aim at focusing at such questions on this day, with discussions and presentations about practical and theoretical challenges. Discussions about problems specific to the IP field (e.g., coherence in updating) are also welcomed.
If you want to make a contribution to this day, please let us know by Friday 31 August 2012 by providing us with a title and short description of your intended contribution. An estimation of the time you would like to have is also welcomed.
Sébastien Destercke, Georg Schollmeyer
Fri 14th Open topics

Tahani Coolen-Maturi, Marco Cattaneo
Sat 15th Excursion

Andrea Wiencierz
Last modification: Georg Schollmeyer,