Parametric Families of Probability Distributions for Functional Data Using Quasi-Arithmetic Means with Archimedean Generators

Parametric probability distributions are central tools for probabilistic modeling in data mining, but they are lacking in functional data analysis (FDA). In this paper we propose to build such distributions by jointly using quasi-arithmetic means and generators of Archimedean copulas. We also define a density adapted to the infinite dimension of the space of functional data. We use these concepts in supervised classification.

1. QAMML distributions

Let $(\Omega, \mathcal{A}, P)$ be a probability space and $D$ a closed real interval. A functional random variable (frv) is any function $X : D \times \Omega \to \mathbb{R}$ such that, for any $t \in D$, $X(t, \cdot)$ is a real random variable on $(\Omega, \mathcal{A}, P)$. Let $L(D)$ be the space of functions $u(t)$ defined on $D$ that are square integrable with respect to the Lebesgue measure. If $f, g \in L(D)$, then the pointwise order between $f$ and $g$ on $D$ is defined as follows:

$$f \le_D g \iff \forall t \in D,\ f(t) \le g(t). \tag{1}$$

The pointwise order is a partial order over $L(D)$, not a total order: two functions whose graphs cross on $D$ are not comparable. We define the functional cumulative distribution function (fcdf) of a frv $X$ on $L(D)$, computed at $u \in L(D)$, by:

$$F_{X,D}(u) = P[X \le_D u]. \tag{2}$$

To compute the above probability, we remark that the probability distribution of $X(t)$ is easy to compute for a specific value of $t$, and this holds for any $t \in D$. We then define the surface of distributions and the surface of densities, respectively, as follows:

$$G : D \times \mathbb{R} \to [0, 1] : (t, y) \mapsto P[X(t) \le y] \tag{3}$$

$$g : D \times \mathbb{R} \to [0, \infty) : (t, y) \mapsto \frac{\partial}{\partial y} G(t, y) \tag{4}$$

Various methods can be used to determine suitable $g$ and $G$ for a given $X$. For example, if $X$ is a Gaussian process with mean $\mu(t)$ and standard deviation $\sigma(t)$, then, for any $(t, y) \in D \times \mathbb{R}$, we have $G(t, y) = F_{N(\mu(t),\sigma(t))}(y)$ and $g(t, y) = f_{N(\mu(t),\sigma(t))}(y)$.

In the following we will always use the function $G$ with a function $u$ of $L(D)$, so, to ease the notation, we will write $G[t; u] = G[t, u(t)]$. We will use the same notation for $g$.
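As an illustration of the Gaussian example, the following minimal sketch evaluates the surface of distributions and the surface of densities, taking g to be the density of X(t) in the variable y, as the Gaussian case suggests. The mean function `mu`, the standard deviation function `sigma`, the interval D = [0, 1], and the grid-based check of the pointwise order are assumptions chosen for illustration only; they do not come from the paper.

```python
import math


def mu(t):
    """Hypothetical mean function of the Gaussian process (illustrative choice)."""
    return math.sin(t)


def sigma(t):
    """Hypothetical standard deviation function, strictly positive on D = [0, 1]."""
    return 1.0 + 0.5 * t


def G(t, y):
    """Surface of distributions: P[X(t) <= y] when X(t) ~ N(mu(t), sigma(t)^2)."""
    z = (y - mu(t)) / sigma(t)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def g(t, y):
    """Surface of densities: Gaussian density of X(t) evaluated at y."""
    s = sigma(t)
    z = (y - mu(t)) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))


def G_at(t, u):
    """Shorthand G[t; u] = G(t, u(t)) for a function u defined on D."""
    return G(t, u(t))


def g_at(t, u):
    """Shorthand g[t; u] = g(t, u(t)) for a function u defined on D."""
    return g(t, u(t))


def leq_on_grid(f, h, grid):
    """Check f <=_D h on a finite grid of D (a discrete approximation only)."""
    return all(f(t) <= h(t) for t in grid)


if __name__ == "__main__":
    grid = [i / 10 for i in range(11)]  # discretisation of D = [0, 1]
    # Evaluating G along the mean curve gives the median level at every t.
    print([G_at(t, mu) for t in grid])
    # Two crossing functions are not comparable under the pointwise order.
    up, down = (lambda t: t), (lambda t: 1.0 - t)
    print(leq_on_grid(up, down, grid), leq_on_grid(down, up, grid))
```

The last two lines show why the pointwise order is only partial: for the crossing functions `up` and `down`, neither comparison holds on the grid.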
In what follows we define our parametric families of probability distributions. Let $X$ be a frv, $u \in L(D)$, and $G$ its surface of distributions. Let also $\phi$ be a continuous, strictly decreasing function from $[0, 1]$ to $[0, \infty]$ such that $\phi(0) = \infty$ and $\phi(1) = 0$, and whose inverse $\psi = \phi^{-1}$ is completely monotonic on $[0, \infty[$, i.e. $(-1)^k \frac{d^k}{dt^k} \psi(t) \ge 0$ for all $t$ in $[0, \infty[$ and for all $k$. We define the Quasi-Arithmetic Mean of Margins Limit (QAMML) distribution of $X$ by : FX,D(u) = ψ [