Laboratory of Structural Methods of Data Analysis in Predictive Modeling,
Moscow Institute of Physics and Technology

Advances in Optimization and Statistics

Dorn Yuriy
Position: Junior Researcher
Gasnikov Alexander
Position: Leading Researcher
Dvurechensky Pavel
Position: Leading Researcher
Krymova Ekaterina
Position: Junior Researcher
Panov Maxim
Position: Junior Researcher
Klochkov Egor
Position: Junior Researcher
Dates and additional information:
Date Additional information


10:00 – 10:35
Asymptotics beats Monte Carlo: The case of correlated local vol baskets
Christian Bayer
We consider a basket of stocks with both positive and negative weights, in the case where each asset has a smile, e.g., evolves according to its own local volatility and the driving Brownian motions are correlated. In the case of positive weights, the model has been considered in a previous work by Avellaneda, Boyer-Olson, Busca and Friz [Risk, 2004]. We derive highly accurate analytic formulas for the prices and the implied volatilities of such baskets. These formulas are based on a basket Carr-Jarrow formula, a heat kernel expansion for the (multi-dimensional) density of the asset at expiry and the Laplace approximation. The formulas are almost explicit, up to a minimization problem, which can be handled with simple Newton iteration, coupled with good initial guesses as derived in the paper. Moreover, we also provide asymptotic formulas for the Greeks. Numerical experiments in the context of the CEV model indicate that the relative errors of these formulas are of order $10^{-4}$ (or better) for $T=\frac{1}{2}$, $10^{-3}$ for $T=2$, and $10^{-2}$ for $T=10$ years, for low, moderate and high dimensions. The computational time required to calculate these formulas is under two seconds even in the case of a basket on 100 assets. The combination of accuracy and speed makes these formulas potentially attractive both for calibration and for pricing. In comparison, simulation-based techniques are prohibitively slow in achieving a comparable degree of accuracy. Thus the present work opens up a new paradigm in which asymptotics may arguably be used for pricing as well as for calibration. (Joint work with Peter Laurence.)

10:35 – 11:10
Brownian motion approach for spatial PDEs with stochastic data
Marcel Ladkau
We deal with PDE problems of elliptic type that are classically solved by the finite element method in the case of deterministic data. Via the Feynman-Kac formula we give a stochastic representation providing us with point-wise solutions. We show how to use the stochastic representation to generate a spatial approximation of the solution. Later we extend this to the case of stochastic data and discuss the advantages compared to FEM for such situations. Our main application will be Darcy's law.
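
As an illustration of the point-wise stochastic representation (a sketch only, not the speaker's method), consider the simplest elliptic problem: the Laplace equation on the unit disk with Dirichlet data g. There the representation reduces to u(x) = E[g(B_tau)], where B is a Brownian motion started at x and tau is its exit time from the disk. The toy domain, the walk-on-spheres sampler, and the names `walk_on_spheres` and `solve_pointwise` are assumptions made for this example.

```python
import math
import random


def walk_on_spheres(x, y, boundary_value, eps=1e-4, rng=random):
    """Draw one sample of g(B_tau) for Brownian motion started at (x, y).

    Assumes the domain is the unit disk. Brownian motion leaves a circle
    centred at its current position at a uniformly distributed point, so
    we can jump straight to the largest circle that fits inside the disk.
    """
    while True:
        r = math.hypot(x, y)
        dist = 1.0 - r  # distance from the current point to the unit circle
        if dist < eps:
            # Close enough: project onto the boundary and read off the data.
            return boundary_value(x / r, y / r)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += dist * math.cos(theta)
        y += dist * math.sin(theta)


def solve_pointwise(x, y, boundary_value, n_paths=10000, seed=0):
    """Monte Carlo estimate of u(x, y) = E[g(B_tau)] at a single point."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += walk_on_spheres(x, y, boundary_value, rng=rng)
    return total / n_paths
```

For boundary data g(x, y) = x the exact solution is u(x, y) = x, so `solve_pointwise(0.3, 0.2, lambda bx, by: bx)` should return a value close to 0.3, up to Monte Carlo error. No mesh is needed, and each evaluation point can be handled independently.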

11:10 – 11:40

11:40 – 12:15
Semiparametric Bernstein-von Mises theorem: non-asymptotic approach
Maxim Panov
The classical asymptotic Bernstein-von Mises theorem will be reconsidered in a non-asymptotic setup. Special attention will be paid to the applicability of the results in the case of growing parameter dimension and to the notion of effective dimension. The results for growing dimension will be extended to semiparametric estimation with a nuisance parameter from a Sobolev class. General results will be accompanied by particular examples illustrating the theory.

12:15 – 12:50
Conditional moment restrictions estimation
Nikita Zhivotovskiy
We are interested in statistical models where parameters are identified by a set of conditional estimating equations (or moment restrictions). Using modern tools proposed by Spokoiny (2011), we will reconsider the properties of the estimator in the generalised method of moments and derive a Wilks expansion for this model. All the results are non-asymptotic and stated for a deterministic design.

12:50 – 14:20

14:20 – 14:55
Additive Regularization for Probabilistic Topic Modelling
Konstantin Vorontsov
Probabilistic topic modeling is a powerful tool for statistical text analysis, which has recently been developed mainly within the framework of graphical models and Bayesian inference. We propose an alternative approach: Additive Regularization of Topic Models (ARTM). Our framework is free of redundant probabilistic assumptions and dramatically simplifies the inference of multi-objective topic models. We also take a non-probabilistic view of the EM algorithm as a simple iteration method for solving a system of equations for a stationary point of the optimization problem.

14:55 – 15:30
Stochastic online gradient-free method with inexact oracle and huge-scale optimization
Gasnikov Alexander
In the talk we generalize the results of Yu. Nesterov, Random gradient-free minimization of convex functions, CORE Discussion Paper 2011/1, 2011.
First of all, we consider the online (two-point) case. Moreover, we consider the general prox-case, that is, we do not restrict ourselves to the Euclidean structure. The main ingredient of our generalization, however, is the assumption of an inexact zero-order oracle (an oracle that returns the value of the function). The inexactness has not only a stochastic component but also a deterministic one. The latter leads to a bias in the stochastic gradient approximation obtained by finite differences. The rather unexpected result is that a properly modified mirror descent algorithm has the same complexity estimate with an inexact oracle, provided the noise level does not exceed the desired accuracy (in function value). All the results are obtained in terms of probabilities of large deviations. Moreover, the proposed algorithms attain complexity estimates that are unimprovable up to a multiplicative constant. These algorithms can be used for a special class of huge-scale optimization problems. We plan to briefly describe one such problem (which comes from Yandex) if time permits. Joint work with Lagunovskaia Anastasia.
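
To make the two-point idea concrete, here is a minimal sketch in the plain Euclidean setting. It is not the modified mirror descent algorithm of the talk. The gradient is replaced by a central finite difference along a random unit direction, and the zero-order oracle is perturbed by a bounded deterministic error. The function names, the particular estimator, and the sine-shaped perturbation are all assumptions chosen for illustration.

```python
import math
import random


def make_noisy_oracle(f, delta):
    """Wrap f into a zero-order oracle with a deterministic error of size <= delta."""
    def oracle(x):
        return f(x) + delta * math.sin(1e3 * sum(x))
    return oracle


def random_unit_vector(dim, rng):
    """Uniform direction on the unit sphere (normalised Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def two_point_gradient(oracle, x, tau, rng):
    """Gradient estimate built from two oracle calls along one random direction."""
    dim = len(x)
    e = random_unit_vector(dim, rng)
    x_plus = [xi + tau * ei for xi, ei in zip(x, e)]
    x_minus = [xi - tau * ei for xi, ei in zip(x, e)]
    slope = (oracle(x_plus) - oracle(x_minus)) / (2.0 * tau)
    # The factor dim makes the estimate unbiased for smooth f as tau -> 0.
    return [dim * slope * ei for ei in e]


def gradient_free_descent(oracle, x0, step, tau, n_iter, seed=0):
    """Plain Euclidean descent driven only by function values."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(n_iter):
        g = two_point_gradient(oracle, x, tau, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x
```

In this sketch the oracle error enters the gradient estimate with a factor of order `dim * delta / tau`, which is the bias the abstract refers to. This is why the smoothing parameter `tau` cannot be taken arbitrarily small when the oracle is inexact.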

15:30 – 16:05
Exponential weighting in the presence of colored noise
Ekaterina Krymova
We present new oracle inequalities for exponential aggregation of regression function estimates under the assumption of heteroscedastic Gaussian noise.
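
For readers unfamiliar with exponential aggregation, the generic form can be sketched as follows: each candidate estimate gets a weight proportional to exp(-risk / beta), and the aggregate is the weighted average. How the risk estimates are computed under heteroscedastic noise is the subject of the talk and is not reproduced here. The inputs `risks` and the temperature `beta` are assumed to be given, and the function names are hypothetical.

```python
import math


def exponential_weights(risks, beta):
    """Weights proportional to exp(-risk / beta), normalised to sum to one."""
    smallest = min(risks)  # subtract the minimum for numerical stability
    raw = [math.exp(-(r - smallest) / beta) for r in risks]
    total = sum(raw)
    return [w / total for w in raw]


def aggregate(estimates, risks, beta):
    """Convex combination of candidate estimates (lists of equal length)."""
    weights = exponential_weights(risks, beta)
    n = len(estimates[0])
    return [sum(w * est[i] for w, est in zip(weights, estimates)) for i in range(n)]
```

A small `beta` concentrates the weight on the candidate with the lowest estimated risk, while a large `beta` moves the aggregate towards a plain average.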

16:05 – 16:35

16:35 – 17:10
Quantification of noise in MR experiments
Joerg Polzehl
We present a novel method for local estimation of the noise level in magnetic resonance images in the presence of a signal. The procedure uses a multi-scale approach to adaptively infer on local neighborhoods with similar data distribution. It exploits a maximum-likelihood estimator for the local noise level. Information assessed by this method is essential for correct modeling in diffusion magnetic resonance experiments as well as for adequate preprocessing. The validity of the method is evaluated on repeated diffusion data of a phantom and simulated data. We illustrate the gain from using the method in data enhancement and modeling of a high-resolution diffusion data set.

17:10 – 17:45
Simultaneous Bayesian analysis of contingency tables in genetic association studies
Thorsten Dickhaus
Genetic association studies lead to simultaneous categorical data analysis. The sample for every genetic locus
consists of a contingency table containing the numbers of observed genotype-phenotype combinations. Under
case-control design, the row counts of every table are identical and fixed, while column counts are random. The aim of
the statistical analysis is to test independence of the phenotype and the genotype at every locus.

We present an objective Bayesian methodology for these association tests, utilizing the Bayes factor F_2 proposed
by Good (1976) and Crook and Good (1980). It relies on the conjugacy of Dirichlet and multinomial distributions,
where the hyperprior for the Dirichlet parameter is log-Cauchy. Being based on the likelihood principle, the
Bayesian tests avoid looping over all tables with given marginals. Hence, their computational burden does not
increase with the sample size, in contrast to frequentist exact tests.

Making use of data generated by The Wellcome Trust Case Control Consortium (2007), we illustrate that the
ordering of the Bayes factors shows a good agreement with that of frequentist p-values.

Finally, we deal with specifying prior probabilities for the hypotheses, by taking linkage disequilibrium structure
into account and exploiting the concept of effective numbers of tests (cf. Dickhaus (2014)).

17:45 – 18:20
Computation of an effective number of simultaneous $\chi^2$ (chi-square) tests
Jens Stange
Common $\chi^2$ tests are very well known and frequently applied in statistical analyses, in particular for discrete models. An application to genetic association studies is considered, where a large number M, say, of 2x3 contingency tables are tested simultaneously. A method controlling the family-wise error rate is shown, which makes use of an effective number of tests in a Šidák-type multiplicity correction. This method approximates the full M-dimensional distribution of the involved $\chi^2$ test statistics by a product of k-dimensional marginal distributions. A challenge of this procedure is the efficient computation of the k-dimensional distributions. Besides time-consuming Monte Carlo procedures, there are only a few implementations of multivariate distributions, even for small dimensions. Existing formulas for the cumulative distribution function of a multivariate $\chi^2$ distribution are now implemented for approximations with k up to 4.
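
The role of an effective number of tests in a Šidák-type correction can be sketched as follows. Given a family-wise level alpha and an effective number of tests M_eff (which may be non-integer and smaller than M), each individual test is carried out at the local level 1 - (1 - alpha)^(1 / M_eff). Computing M_eff from the k-dimensional marginals is the actual content of the talk and is not attempted here. In this sketch `m_eff` is simply an input, and the function names are hypothetical.

```python
def sidak_local_level(alpha, m_eff):
    """Local significance level for each test under a Sidak-type correction.

    alpha : target family-wise error rate, strictly between 0 and 1
    m_eff : effective number of tests (>= 1, assumed to be given)
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if m_eff < 1:
        raise ValueError("m_eff must be at least 1")
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)


def familywise_error(local_level, m_eff):
    """Family-wise error rate of m_eff independent tests at the given local level."""
    return 1.0 - (1.0 - local_level) ** m_eff
```

Because M_eff is at most M for dependent test statistics, the resulting local level is larger than the one obtained from the plain Šidák correction with M tests, which makes the procedure less conservative.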



10:00 – 10:35
Oracle-type posterior contraction rates in Bayesian inverse problems
Peter Mathe
We discuss Bayesian inverse problems in Hilbert spaces. The focus is on a fast concentration of the posterior probability around the unknown true solution as expressed in the concept of posterior contraction rates. Previous results determine posterior contraction rates based on known solution smoothness. Here we show that an oracle-type parameter choice is possible. This is done by relating the posterior contraction rate to the root mean squared estimation error. The talk is based on joint work with K. Lin and S. Lu, Fudan University, Shanghai.

10:35 – 11:10
Pathwise stability of likelihood estimators for diffusions via rough paths
Hilmar Mai
We consider the estimation problem of an unknown drift parameter within classes of non-degenerate diffusion processes. The Maximum Likelihood Estimator (MLE) is analyzed with regard to its pathwise stability properties and robustness towards misspecification in volatility and even the very nature of noise. We construct a version of the estimator based on rough integrals (in the sense of T. Lyons) and present strong evidence that this construction resolves a number of stability issues inherent to the standard MLEs. We will also discuss some numerical examples to demonstrate the relevance of our results in a finite sample setting.

11:10 – 11:40

11:40 – 12:15
Intermediate gradient method for convex problems with stochastic inexact oracle
Pavel Dvurechensky
In this talk we consider another step towards the universalisation of methods for convex optimization problems. In a number of recent articles by O. Devolder, F. Glineur and Yu. Nesterov, and in O. Devolder's PhD thesis, the authors considered convex optimization problems with an inexact oracle in the sense of deterministic and stochastic errors. It is shown in these works that the dual gradient method has a slow convergence rate but does not accumulate the deterministic error of the oracle. In contrast, the fast gradient method has a faster convergence rate but accumulates the deterministic oracle error linearly. They proposed an intermediate gradient method (IGM) for an inexact oracle. This method makes it possible to trade off the rate of convergence against error accumulation, depending on the problem parameters. In our work we introduce a new IGM which can be applied to problems with composite structure, a stochastic inexact oracle and a non-Euclidean setup. We provide an estimate for the mean convergence rate and bound the large deviations from this rate.
Joint work with Alexander Gasnikov.

12:15 – 12:50
The uniqueness of equilibrium in the Nesterov-de Palma model
Dorn Yuriy
In this talk we give a simple criterion for the uniqueness of equilibrium in the Nesterov-de Palma model. We also propose a new method for computing the equilibrium in the Nesterov-de Palma model.

12:50 – 14:20

14:20 – 14:55
Finite sample analysis of semiparametric M-Estimators
Andreas Andresen
Semiparametric models are characterized by an infinite-dimensional parameter, while the target of estimation is only a finite-dimensional, often low-dimensional, component. A prominent example is the estimation of a finite-dimensional projection of the full parameter via an M-estimator, for example the profile Maximum Likelihood Estimator (pMLE). Despite the full model being nonparametric, root-n rates can be attained for such estimators. The semiparametric Wilks and Fisher theorems show that the semiparametric log-likelihood ratio is asymptotically chi-square distributed, with degrees of freedom equal to the dimension of the target parameter, and that the pMLE is semiparametrically efficient. We present a method to extend these results to a non-asymptotic setting and to obtain explicit bounds for the "small terms". For a broad class of models, this allows one to determine critical ratios of the full dimension to the sample size in the context of sieve estimators. The results are illustrated with the single index model.

14:55 – 15:30
Model selection by Lepski's method for penalized likelihood
Niklas Willrich
We explore the application of Lepski's method in a finite sample framework to a penalized likelihood and compare the results to oracle bounds.

15:30 – 16:00

16:00 – 16:35
Semiparametric estimation in errors-in-variables regression
Egor Klochkov
We consider the errors-in-variables linear regression problem with non-linear regressor functions. A sieve approach is used to define the smoothness of the target function and the design. We apply a semiparametric approach in which the unknown design is treated as a nuisance parameter.

16:35 – 17:10
Variable Selection in Cluster Analysis by Means of Bootstrapping
Hans-Joachim Mucha
Variable selection is a difficult task in many areas of multivariate statistics such as classification, clustering and regression. Here the hope is that the structure of interest may be contained in only a small subset of variables. In contrast to supervised classification such as discriminant analysis, variable selection in cluster analysis is a much more difficult problem because usually nothing is known about the true class structure, and hence nothing is known about the number of clusters K inherent in the data.
There are many proposals on variable selection in cluster analysis based on special cluster separation measures such as the criterion of Davies and Bouldin (1979). Here we present a general bottom-up approach to variable selection using non-parametric bootstrapping based on criteria of stability such as the adjusted Rand index (Hubert and Arabie, 1985). "General" means that it makes use only of measures of stability of partitions, and so it can be applied to almost any cluster analysis method.
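
The stability criterion mentioned above, the adjusted Rand index of Hubert and Arabie, compares two partitions of the same objects and corrects the raw pair-counting agreement for chance. A minimal sketch of its computation is given below. The bootstrap and variable-selection loop of the talk is not reproduced, and the function name is an assumption of this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same objects."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same n >= 2 objects")
    # Pairs of objects placed together in both partitions (contingency cells).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each partition separately (row and column sums).
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        # Degenerate case, e.g. both partitions consist of a single cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)
```

The index equals 1 for identical partitions regardless of how the clusters are labelled, and is close to 0 when the agreement is no better than chance. In a stability analysis it would be evaluated between the clustering of the original data and the clusterings of bootstrap samples.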

17:10 – 17:45
Asymptotic behavior of tensor product splines under various smoothing methods
Michail Belyaev
The classical multivariate regression problem with a factorial design of experiments will be considered. Recent applied problems require more flexible smoothing capabilities than standard algorithms provide. There is also a strict requirement for computational efficiency. These concerns lead to the development of new smoothing algorithms. Asymptotic properties of such algorithms will be studied.