ST 790 (001) Fall 2022 Advanced Special Topics

Potential topics and resources

Below is a list of ideas for potential course projects somewhere in the intersection of statistics, machine learning, and imprecise probability. Some of the ideas are on the theoretical side, some are more methodological or computational, and some are borderline philosophical. It doesn’t matter, though: the project can be pretty much anything, basically whatever interests you. My list is definitely not comprehensive; these are just things that (a) I know something about and (b) came to mind as I was preparing this list. So please feel free to come up with your own ideas. You’re also welcome to talk with me about your interests and what you have in mind.

I was originally planning to prepare a static PDF document that listed out some suggested project topics, references, etc., but I found the idea of listing everything out at once in a single document rather daunting. So, instead, I’m going to list out ideas and references here as I think of them. When I’ve made an update, I’ll change the “Last updated” date below. The order in which things are listed isn’t relevant.

Last updated: 09/26/2022

Inferential models (IMs)

I have to suggest this as a possible project idea because this is specifically where my interests lie. Some new and general ideas are being developed now, but this is intended to encompass all of the approaches that have been suggested so far, so there’s no harm in investigating some of the “old” formulations. Below are some miscellaneous thoughts or things that I’ve considered before but didn’t have time to dig into myself. If anything here is of interest to you, then please talk to me—I can help you get oriented.

Basically any statistical problem that’s interesting to you is fair game. Some work has been done on a number of statistical problems, but a thorough investigation and comparison with other existing methods has been done on very few problems, so this is wide open. A few examples in which things have been done include meta-analysis, censored data, prediction, mixed models, linear regression, and non-regular problems (those for which the “usual regularity conditions” fail); even for these, there are opportunities to dig deeper, e.g., some more complex survival data problems, like Cox proportional hazards models, would be interesting. Besides these, problems that haven’t been touched yet (at least not in a satisfactory way) include discrete-data problems (contingency tables, binary & Poisson regression, etc.) and mixture models.
Related to the above point, very little has been done in connecting IMs to machine learning. As far as I know, the only work on this has been the two related papers (here and here) on conformal prediction. There’s probably still some more that can be done with conformal prediction, but I’ve not thought much about this recently. So this is almost completely open; anything in the intersection of IMs and machine learning is fair game.
Quantifying uncertainty about the model itself is an important problem, one that has not been seriously explored yet. I started working on it here, but what’s missing is some penalization on the complexity of the model; I have some ideas that are related to the notion of “partial priors”…
An interesting but seemingly very specific problem is inference on the maximum of a normal mean vector. This fits under the category of “non-regular” inference because the max operator is not a smooth function of the underlying parameter. This is practically relevant because the non-regularities can emerge in unexpected places. One relatively recent case is in the context of dynamic treatment regimes (e.g., Chapter 3.5 in the book by Tsiatis et al), where the max operator appears naturally. A nice generalized fiducial solution to the max-normal-mean problem is in this paper; it’d be great to see what the IM framework could do for this and other non-regular problems.
More coming soon

Other brands of statistical inference

At the ISIPTA’19 meeting in Ghent, Professor Gert de Cooman was asked what the biggest open problem in imprecise probability is, and he said “Statistical inference.” So there are lots of interesting things that can still be done. A good reference for what was known about statistical inference and imprecise probability, around the early 2010s, is Chapter 7 of the Augustin et al Introduction to Imprecise Probability text. Some more specifics:

Are there any imprecise-probabilistic aspects of Hannig et al’s generalized fiducial inference? A recent paper attempted to flesh out some of these connections but I wasn’t convinced by many of the claims these authors made; see my comments. I believe that there are some connections, but I imagine that these would come in the form of approximations. For example, is the generalized fiducial distribution an approximation of some meaningful imprecise probability? What do you get when applying the probability-to-possibility transform to the generalized fiducial distribution?
There has been extensive work on statistical inference using Dempster–Shafer theory, including the work by Dempster and by Shafer. There are a number of interesting papers by Thierry Denoeux and co-authors (e.g., here, here, and here), among many others, and it would be interesting to see how these frameworks, and their particular instantiations in real examples, compare to (generalized) fiducial, IMs, etc. There’s also an interesting framework related to Dempster–Shafer, called the theory of hints (due to Kohlas & Monney) that would be worth exploring; unfortunately, I don’t know too much about this.
Generalized Bayes is a framework that’s (obviously) related to Bayes but falls closer to imprecise probability; we’ll cover this in class around Week 07. For the simplest formulation, suppose there is a credal set of prior distributions. Then, for every prior in the credal set, one can apply Bayes’s rule to get a posterior, resulting in a credal set of posterior distributions. One can derive lower and upper posterior probabilities by taking the infimum and supremum over this class of ordinary posterior distributions; this is generalized Bayes’s rule. This is not unlike robust Bayes, which has been mentioned on several occasions in the lecture. Chapter 9 of Augustin et al has some details and references about this. Peter Walley was a leader in this area, but I’m not familiar with the recent literature. Walley’s imprecise Dirichlet model has been given considerable attention in the literature and, depending on time, I may present some details about this in lecture.
Something relatively new that’s come along is the safety-focused framework developed by Peter Grünwald and colleagues, along with the related work by Aaditya Ramdas and his colleagues. (Here is a draft of a review paper Aaditya sent me that would be a good place to start if you’re interested in this.) The “safety” aspect is related to having certain frequentist guarantees in finite samples, not just asymptotically; see, e.g., here. In light of these guarantees, and the characterization in my recent paper, I can’t help but believe that there’s an IM/imprecise-probabilistic aspect to this. Their work is interesting anyway, but I’d be curious to see what that imprecise-probabilistic angle is. This work also has some game-theoretic probability aspects to it, which itself is related to imprecise probability (see Chapter 6 of the Augustin et al text).
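
To make the generalized Bayes item above a bit more concrete, here is a minimal sketch in Python. Everything specific in it is an assumption chosen for illustration: the parameter takes values on a small finite grid, the model is binomial, and the "credal set" is just three hand-picked prior vectors (a real credal set would typically be a convex set, of which these might be the extreme points). The function names are hypothetical too. The code applies Bayes's rule prior-by-prior and then takes the minimum and maximum posterior probability of an event.

```python
from math import comb


def binomial_likelihood(theta, k, n):
    """Binomial probability of k successes in n trials at success rate theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def posterior(prior, thetas, k, n):
    """Ordinary Bayes's rule on a finite grid of parameter values."""
    weights = [p * binomial_likelihood(t, k, n) for p, t in zip(prior, thetas)]
    total = sum(weights)
    return [w / total for w in weights]


def lower_upper_posterior(priors, thetas, k, n, event):
    """Min and max posterior probability of `event` over a finite set of priors."""
    probs = []
    for prior in priors:
        post = posterior(prior, thetas, k, n)
        probs.append(sum(q for q, t in zip(post, thetas) if event(t)))
    return min(probs), max(probs)


# Illustrative (assumed) grid and priors; none of this comes from the course notes.
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
priors = [
    [0.20, 0.20, 0.20, 0.20, 0.20],  # uniform
    [0.40, 0.30, 0.15, 0.10, 0.05],  # favors small theta
    [0.05, 0.10, 0.15, 0.30, 0.40],  # favors large theta
]

lower, upper = lower_upper_posterior(
    priors, thetas, k=7, n=10, event=lambda t: t > 0.5
)
print(f"lower = {lower:.3f}, upper = {upper:.3f}")
```

The gap between the lower and upper values reflects how much the conclusion depends on which prior in the set is used; with a single prior the two coincide and this is just ordinary Bayes.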

Statistics & machine learning problems

Above, the focus was more on specific approaches to statistical inference, etc., but a good project topic would be to pick an interesting, practically relevant statistics or machine learning problem and explore what’s been done with it in the literature, in particular, the imprecise probability literature. One general suggestion I can make is to browse some of the recent issues of the International Journal of Approximate Reasoning (the flagship journal of imprecise probability) to see what people are interested in; if you find a topic there that’s interesting, then you can use that paper to find some background and earlier references. Similarly, you can check out some of the recent ISIPTA (’19 and ’21) and BELIEF conference proceedings. Below is just a list of topics; some of them I know about and some of them I don’t. I write a little bit more below about those that I know something about.

Causal inference. At ISIPTA’21, several speakers emphasized the importance of causal inference and the promise that an imprecise-probabilistic approach might offer. I don’t know very much about this, but it’d definitely make a good topic for the project.
Clustering. Some discussion of the belief function approach to this can be found in Cuzzolin’s book. Definitely there’s ongoing work on this problem in the belief function community, so lots to read.
Classification. There’s discussion of this in Cuzzolin’s book and a chapter in Augustin et al. This is also a very active area of work, even something that I considered (using a very specific approach). Also, a former student of mine, Joyce Cahoon, had a chapter in her thesis with some preliminary work on the use of Walley’s imprecise Dirichlet model for classification of certain kinds of text, e.g., Tweets.
Prediction. This is obviously a super-important problem; classification is basically a special case. The conformal prediction framework of Vovk & Shafer has some imprecise-probabilistic aspects to it, but there’s a lot more beyond that. In particular, Thierry Denoeux has a very recent paper (to appear in this year’s BELIEF conference proceedings) about a neat approach based on “random fuzzy sets”. (Random fuzzy sets is a technically interesting and practically useful idea more generally, not just for prediction.)
Missing/censored/coarse data. This is a more classical problem in imprecise probability, so there’s an extensive literature. I’ve not dug into this much myself, but Couso & Dubois (IJAR 2018) is one reference I know of. There is an interesting aspect of this general setup that I want to explore with respect to something new that I’m working on, but I haven’t given it all that much thought yet.
Information fusion. This concerns the general problem of combining information from different sources, not unlike meta-analysis if you’re familiar with that. Of course, one such strategy is Dempster’s rule of combination, but there are generalizations and other strategies entirely. A nice review paper on this is Dubois et al (2016). There were also a few papers published on this in the BELIEF 2021 conference proceedings.
Decision-making. There’s an excellent review paper by Thierry Denoeux, also a chapter in the Augustin et al text. I took a shot at decision theory in my IM context here and I could talk more about that with you if you’re interested.
Reinforcement learning. I don’t know about this in the context of imprecise probability… but it’s obviously an interesting problem and has connections to, e.g., the dynamic treatment regimes applications mentioned briefly above.
Model assessment and selection. I don’t know much about this in the context of imprecise probability…
Transfer learning. I don’t know much about this, but I know it’s interesting and relevant…
Deep learning. I don’t know much about this, but I know it’s interesting and relevant… One paper that I encountered (indirectly suggested to me by Fabio Cuzzolin) is a recent paper on epistemic deep learning.
Data privacy. I don’t know much about this, except that my friend Ruobin Gong is working on it and she has at least some ideas about the role imprecise probability might play in this; she has a talk on YouTube about this and other aspects of imprecise probability in statistics.
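
For the information fusion item above, here is a minimal sketch of Dempster’s rule of combination on a finite frame. The representation (a mass function stored as a dictionary from frozensets to masses), the function names, and the numbers are all assumptions made for illustration; the rule itself also presumes the two bodies of evidence are independent. Masses of intersecting focal sets are multiplied and pooled, the mass landing on the empty set is treated as conflict, and the rest is renormalized.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) by Dempster's rule."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # mass that would land on the empty set
    if 1.0 - conflict <= 1e-12:
        raise ValueError("Sources are totally conflicting; the rule is undefined.")
    return {s: mass / (1.0 - conflict) for s, mass in combined.items()}


def belief(m, event):
    """Total mass of focal sets contained in `event`."""
    return sum(mass for s, mass in m.items() if s <= event)


def plausibility(m, event):
    """Total mass of focal sets that intersect `event`."""
    return sum(mass for s, mass in m.items() if s & event)


# Illustrative (assumed) frame and evidence.
frame = frozenset({"a", "b", "c"})
m1 = {frozenset({"a"}): 0.6, frame: 0.4}
m2 = {frozenset({"b"}): 0.5, frozenset({"a", "b"}): 0.3, frame: 0.2}

m12 = dempster_combine(m1, m2)
event = frozenset({"a"})
print(belief(m12, event), plausibility(m12, event))
```

The renormalization step is exactly what some of the alternative combination strategies modify, so a sketch like this could serve as a baseline when comparing them.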

Applications

This is a project direction that I can’t provide too much general guidance on. But if you have an application area of interest, then go ahead and see what imprecise probability work is out there. I might be able to help with some suggestions if you talk to me about the specifics you have in mind. Below is just a list of things that popped into my head as I was updating this page.

Chapters 12–14 in the Augustin et al text consider some application areas. There are also some engineering-focused applications discussed in the Aslett et al book.
A nice application of Dempster–Shafer theory that I’m aware of is Edlefsen et al (AoAS 2009). Around that time, a hot topic was the Higgs boson and this spurred some interest in the general problem of estimating a genuine signal in the presence of some background signal; this is related to the more general problem of inference on parameters under non-trivial constraints, e.g., Mandelkern (Stat Sci 2002).
Aside from the more engineering-focused applications in the books listed above, I can imagine that there is lots of work on (or at least opportunities for work on) imprecise-probabilistic methods in economics/econometrics, psychology, astronomy, etc. Digging into the literature on these applications and finding what has been done in stat/ML and what has or could be done with imprecise probability would make a nice project.
Actuarial science is another potential area of application. I know a little more about this than the others, which is why I separated this out in my list. My assessment is that “Bayesian” is the preferred approach in actuarial science, but I could be wrong. A reason for preferring the Bayesian formulation is that it can make some claims about prediction rules being “optimal”; prediction is a critical problem because insurance companies want (and need) to know, e.g., roughly how much they’ll have to pay out in claims next year. But the Bayesian-style optimality obviously assumes that the prior is known/given, so there are immediate opportunities to generalize to imprecise-probabilistic solutions. I have a recent paper where my co-author and I incorporate some imprecise-probabilistic ideas into the classical framework of credibility theory.
Of course, this is not an exhaustive list…

Computation

As I’m sure you already know, computation can be a challenge in the context of imprecise probability. But rather than being discouraged by this, I hope you see it as an opportunity to develop something new and impactful. Bayesian statistics faced the same challenges prior to the 1980s, but then some new ideas (and technological advances) totally changed the game. Now fitting very complex, high-dimensional Bayesian models is almost trivial. Some of those same advances can be used for imprecise probability, but there are some additional challenges, e.g., evaluating Choquet integrals involves both integration and optimization.
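
To illustrate the remark about Choquet integrals, here is a minimal sketch on a finite space. The setup is an assumption chosen for simplicity: the upper probability is a possibility measure, defined by maximizing a contour function over the event (the optimization part), and the upper expectation is the Choquet integral of a function with respect to that measure (the integration part). The function names, the contour values, and the "loss" values are all hypothetical.

```python
def possibility_measure(contour):
    """Upper probability of an event = max of the contour over the event."""
    def upper_prob(event):
        return max((contour[x] for x in event), default=0.0)
    return upper_prob


def choquet_integral(f, capacity, space):
    """Choquet integral of f (dict: point -> value) w.r.t. a capacity on a finite space.

    Assumes capacity(empty set) = 0. Points are visited in decreasing order of f,
    and each value is weighted by the increment in capacity of the growing upper set.
    """
    ordered = sorted(space, key=lambda x: f[x], reverse=True)
    total = 0.0
    previous = 0.0
    upper_set = set()
    for x in ordered:
        upper_set.add(x)
        current = capacity(frozenset(upper_set))
        total += f[x] * (current - previous)
        previous = current
    return total


# Illustrative (assumed) inputs.
space = ["low", "medium", "high"]
contour = {"low": 0.3, "medium": 1.0, "high": 0.6}
loss = {"low": 1.0, "medium": 2.0, "high": 5.0}

upper_prob = possibility_measure(contour)
upper_expectation = choquet_integral(loss, upper_prob, space)

# Lower expectation via conjugacy: minus the upper expectation of minus the loss.
negated = {x: -v for x, v in loss.items()}
lower_expectation = -choquet_integral(negated, upper_prob, space)
print(lower_expectation, upper_expectation)
```

On a space this small the sort and the maxima are trivial; the computational difficulty mentioned above shows up when the space is continuous or high-dimensional, so that each capacity evaluation is itself an optimization problem nested inside the integral.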

There is a chapter on computation in the Augustin et al book and in the Aslett et al book. Also, I know that computation has been a hot topic in the last two ISIPTA meetings, and probably has been for some time. So you could check out some of the recent conference proceedings papers for some more modern references. A former student and I have a paper in the ISIPTA’21 proceedings about Monte Carlo/stochastic gradient descent ideas for doing certain IM-related computations. That paper isn’t all that remarkable, but it could be a good starting point for general computational strategies for the latest version of the IM framework. There was also the recent JASA discussion paper on Gibbs sampler-based techniques for computations in the Dempster–Shafer framework.

Imprecise-probability proper

Imprecise probability, like ordinary probability, is a topic in its own right, so there are interesting questions that don’t have anything directly to do with statistics, machine learning, etc. The recent journal issues and conference proceedings suggested above would be good references to see what kinds of theoretical questions people are studying these days. Below are a few topics that I think would make interesting projects.

Imprecise Markov chains and other stochastic processes
Convergence properties of imprecise probabilities
Deeper dive into random set theory (e.g., convergence properties)
Conditioning/updating operations, contraction & dilation
Versions of Dempster’s rule for dependent bodies of evidence
Notions of independence and correlation in imprecise probability
There are some other kinds of imprecise probability models that we didn’t explore in the class, for example, p-boxes and clouds. Chapter 4 in the Augustin et al text gives a brief introduction to these models. I’m personally interested in p-boxes but haven’t had time to dig into this too carefully. A comprehensive reference to p-boxes that I can suggest is this SANDIA tech report by Ferson et al.
…